Introduction.
data
Graceful Enhancement
Replace a Set of Pointers With a Base Pointer to Reduce Data Bloat
Challenge
Reduce data bloat due to the use of many pointers. Pointers in the Itanium® architecture are twice the size of pointers in 32-bit Intel® architecture, which may effectively double the size of data structures that are largely composed of pointers.
The following code defines a structure composed entirely of pointers:
Manage Structure Padding to Avoid Data Bloat
Challenge
Reduce or eliminate data bloat due to structure padding. With the Itanium® architecture, data boundaries are naturally aligned, instead of freely (any-byte) aligned as on 32-bit Intel® architecture. Depending on the field order in a 64-bit struct, this change in boundaries may lead to padding of 32-bit fields, causing data bloat.
If the following Win32* code were compiled for the 64-bit Intel architecture, the two fields height and weight would be padded, because they are 32-bit fields placed between 64-bit alignment boundaries:
Manipulate Data Structure to Optimize Memory Use on 32-Bit Architecture
Challenge
Improve memory utilization by manipulating data-structure layout. For certain algorithms, like 3D transformations and lighting, there are two basic ways of arranging the vertex data. The traditional method is the array of structures (AoS) arrangement, with a structure for each vertex, as shown below:
Loop Blocking to Optimize Memory Use on 32-Bit Architecture
Challenge
Improve memory utilization by means of loop blocking. The main purpose of loop blocking is to eliminate as many cache misses as possible. Consider the following loop, as it exists before blocking:
Avoid Partial Memory Accesses on 32-Bit Intel® Architecture
Challenge
Avoid partial memory accesses. Consider a large load that follows a series of small stores to the same area of memory (beginning at memory address mem). The large load stalls in this case, as shown here:
    mov mem, eax        ; store dword to address "mem"
    mov mem + 4, ebx    ; store dword to address "mem + 4"
    :
    :
    movq mm0, mem       ; load qword at address "mem", stalls
