Buffer overflow and caches

I have a small doubt concerning the behavior of the caches during a buffer overflow attack.

Consider the following sequence of events:
1. Suppose the stack is in the d-cache.
2. Assume no context switch occurs.
3. A return instruction transfers control to a location on the stack, so %eip points to a stack address.
4. The processor looks up that stack address (now treated as code) in the i-cache.
5. L1 i-cache miss: the old stack line is brought in from L2.
6. So the new stack contents have not been written back to memory, and the old (stale) stack is loaded into the i-cache.
7. The stack now coexists in both the d-cache and the i-cache, with different contents.

Only one copy of these changes will actually be written back; moreover, the stack bytes in the i-cache were loaded from the stale L2 line.

If such a sequence of events occurs, then the buffer overflow attack would be foiled, possibly with an invalid-instruction exception.
Is this sequence of events possible?

I understand this can now be handled by removing execute permission from the stack, but how was it handled otherwise on Pentium-based systems?

Thank you.


Quoting - iamrohitbanga

This is where ring-based protection comes in. The paged memory model also helps sort this out, because the stack gets its own set of addresses. In theory that means that, if managed correctly, it is possible for an overflow not to affect anything else under this memory model.

Many operating systems do not use this and "cheat" by using a flat memory model, which is what makes them more vulnerable to stack and buffer overflow attacks.

Removing execute permission is not a bad start, but it is only a first step toward protecting the stack. From what I understand, 32-bit Pentium processors do not have hardware DEP or an NX bit (though I may be wrong).

OK,
but the more important point is: how is it decided what goes into the d-cache and what goes into the i-cache?

If the operands are assumed to be data, then putting them into the d-cache, modifying them to contain instructions, and jumping to that location would cause an i-cache miss even though the bytes are actually cached in the d-cache. Both caches would then hold the same addresses with inconsistent contents.

So finally the question is: how is it decided what should go into the d-cache versus the i-cache?

...I just found out that snooping is used to maintain cache coherency among multiple processors. It could be used to maintain coherency between the d-cache and i-cache as well.
http://en.wikipedia.org/wiki/Cache_coherency
Is anyone sure of the exact implementation?

Quoting - iamrohitbanga

Modifying data in the d-cache to turn it into instructions and then trying to execute it will simply not work, because the processor has no way of "guessing" whether a given byte is an instruction or not. If it tried to do this, there would be a security issue: data could be executed, and that is not necessarily a good thing.

The best way to solve this is to create a block of NOPs and then modify them. This ensures you are modifying code that is fetched through the i-cache, not data sitting in the d-cache.

The rest of the question is confusing. If you can provide an example of exactly what you are attempting to do, it will be easier to see which path to take.

Quoting - Adam Kachwalla

Security issue: yes, but this protection was not present in the earlier Pentium-based systems (referring to Chapter 10 of Computer Systems by Bryant).

What I wish to understand is the strategy those systems adopted to synchronize data between the d-cache and i-cache while using a write-through approach.

How is it decided which cache, i or d, should contain the data?

I'll have to work out a full example; it will take some time.

Here's a quick one, though:

# AT&T (GCC) syntax
push $0x23843443
push $0x23424345
.
.
.
# the pushed words are in the d-cache now
mov %esp, %eax      # %eax points at the pushed bytes
jmp *%eax           # indirect jump into the stack
# i-cache miss: the stale stack line is fetched from L2 into the i-cache
# the stack is now present in both the d-cache and the i-cache
# assume DEP is not enabled

Set of NOPs: yes, but I want to understand the cache-management scheme that separates content between the d-cache and the i-cache.

Quoting - iamrohitbanga

Congratulations on the green belt.

The protection was software-based. It may not have been as good as it is now, but it was still better than nothing.

The cache handling in the older processors was not the best. IIRC, data is served from the d-cache when it is referenced by instructions.

In your example above, the processor would only worry about the stack in the d-cache. OK, I'll try to explain:

  1. An instruction block is loaded into the i-cache.
  2. Instructions are executed in sequential order.
  3. As they reference data in RAM, if an item is not in the d-cache, the processor will try to find it in the i-cache.
  4. If it is not found in the i-cache, it will look in RAM.

This is basically what I remember of how the processor manages what goes into which cache. As a result of this scheme, some instructions will of course land in the d-cache.

As for writing back:

  1. Modifications are recorded in the d-cache.
  2. When any operation that requires a cache flush is performed, the d-cache is written back to RAM.

Setting the instruction pointer to a data location should, from my understanding, move the d-cache data into the i-cache. This is where the buffer-execution vulnerabilities come in.

All of this covers how data is loaded into the respective caches. As for how data is written back, I am not exactly sure.

Strictly speaking, the synchronization of the caches should be transparent. It is usually the d-cache that stores modifications and data addresses, while the i-cache stores only the instructions to be executed. I think read operations will obtain data from the d-cache if it is present in both caches.

Could you cite some references?
