| Last Modified On : | May 7, 2008 2:48 AM PDT |
Rate |
|
Improve memory utilization by means of strip mining. Strip mining, also known as loop sectioning, is a loop-transformation technique for enabling SIMD-encodings of loops, as well as providing a means of improving memory performance. First introduced for vectorizers, this technique consists of the generation of code when each vector operation is done for a size less than or equal to the maximum vector length on a given vector machine. Consider the following pseudocode, as it exists before strip mining:
typedef struct _VERTEX {
|
The main loop consists of two functions:
For each object, the main loop calls a transformation routine to update some data, then calls the lighting routine to further work on the data. If the size of array v[Num] is larger than the cache, then the coordinates for v[i] that were cached during Transform(v[i]) will be evicted from the cache by the time we do Lighting(v[i]). This means that v[i] will have to be fetched from main memory a second time, reducing performance.
Fragment large loops into smaller segments or strips, thus transforming the loop structure twofold:
In the following code, the computation has been strip-mined to a size strip_size. The value strip_size is chosen such that strip_size elements of array v[Num] fit into the cache hierarchy. By doing this, a given element v[i] brought into the cache by Transform(v[i]) will still be in the cache when we perform Lighting(v[i]), and thus improve performance over the non-strip-mined code.
main() |
The following items are related to this one:
IA-32 Intel® Architecture Optimization Reference Manual

Patrick Beard
for (i=0; i<Num; i++) {
Transform(v[i]);
Lighting(v[i]);
}
Have the same effect? Of course many real world programs can't be transformed so easily.