Data Alignment when Migrating to 64-Bit Intel® Architecture

 

Introduction

The introduction of Intel® Extended Memory 64 Technology (Intel® EM64T) and continuing advancements in the Intel® Itanium® processor have broadened the field of 64-bit computing considerably. These advances make the 64-bit operating environment on Intel® architecture a highly desirable destination for the migration of existing 32-bit code.

The Itanium processor supports this migration in two key ways: it runs 32-bit x86 code natively, and it makes source-code porting to full 64-bit implementations easy. Tools available today from Intel, Microsoft and the open-source community make the migration of source code fairly straightforward.

Moreover, once code has been ported successfully to the Itanium Processor Family, it often requires no more than a recompile to run under a 64-bit operating system on Intel EM64T. The main caveats and limitations to this interoperability concern assembly code, intrinsics, and macros; details are available from the Industry Developer's Guide.


Alignment of Data Items

Developers who undertake the migration to the 64-bit operating environment will discover that a small set of issues recurs frequently on both Win64* and Linux* platforms.

One of these is the alignment of data items: their location in memory in relation to addresses that are multiples of four, eight or 16 bytes. Under the 16-bit Intel architecture, data alignment had little effect on performance, and its use was entirely optional. Under IA-32, aligning data correctly can be an important optimization, although it remains optional except in a very few cases where correct alignment is mandatory. The 64-bit environment, however, imposes more stringent requirements on data items: misaligned objects cause program exceptions. For an item to be aligned properly, it must fulfill the requirements imposed by 64-bit Intel architecture (discussed shortly), plus those of the linker used to build the application.

The fundamental rule of data alignment is that the safest (and most widely supported) approach relies on what Intel terms "the natural boundaries." These are the boundaries obtained by rounding the size of a data item up to the next of two, four, eight or 16 bytes. For example, a 10-byte (80-bit) floating-point value should be aligned on a 16-byte address, whereas 64-bit integers should be aligned on an eight-byte address. Because this is a 64-bit architecture, all pointers are eight bytes wide, and so they too should align on eight-byte boundaries.
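
The rounding rule above can be sketched in a few lines of C. This is only an illustration, not code from Intel: the function name natural_alignment is hypothetical, and the cap at 16 bytes reflects the largest natural boundary named in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the natural boundary for an item of the
   given size by rounding up to the next of 1, 2, 4, 8 or 16 bytes.
   Sizes above 16 are capped at 16, the largest boundary in the text. */
size_t natural_alignment(size_t size)
{
    assert(size > 0);
    size_t boundary = 1;
    while (boundary < size && boundary < 16)
        boundary *= 2;
    return boundary;
}
```

With this sketch, a 10-byte item maps to a 16-byte boundary and an eight-byte pointer or 64-bit integer maps to an eight-byte boundary, matching the examples in the paragraph above.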

It is recommended that all structures larger than 16 bytes align on 16-byte boundaries. In general, for the best performance, align data as follows:

  • Align 8-bit data at any address
  • Align 16-bit data to be contained within an aligned four-byte word
  • Align 32-bit data so that its base address is a multiple of four
  • Align 64-bit data so that its base address is a multiple of eight
  • Align 80-bit data so that its base address is a multiple of sixteen
  • Align 128-bit data so that its base address is a multiple of sixteen
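
One way to verify the base-address rules in this list at run time is to test whether an address is a multiple of the required boundary. The following is a minimal sketch in C; the name is_aligned is an assumption made for illustration, and the check assumes the boundary is a power of two, as all the boundaries listed above are.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when the address held in ptr is a multiple
   of alignment. alignment must be a nonzero power of two. */
bool is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

A debug build might call such a helper on the address of a 64-bit field with an alignment of 8, or on a 128-bit item with an alignment of 16, before the data is handed to performance-critical code.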


A 64-byte or greater data structure or array should be aligned so that its base address is a multiple of 64. Sorting data in decreasing size order is one heuristic for assisting with natural alignment. As long as 16-byte boundaries (and cache lines) are never crossed, natural alignment is not strictly necessary, although it is an easy way to enforce adherence to general alignment recommendations.
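
As an illustration of the 64-byte recommendation, the sketch below uses the C11 alignas specifier to request a base address that is a multiple of 64. It assumes a C11 compiler; the type sample_block, the object shared_block, and the helper function are hypothetical names, and older compilers would need a vendor-specific alignment attribute instead.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical 128-byte structure. Aligning its first member on 64
   bytes raises the alignment of the whole structure to 64. */
struct sample_block {
    alignas(64) double values[16];
};

/* Compile-time checks that the request took effect. */
static_assert(alignof(struct sample_block) == 64, "expected 64-byte alignment");
static_assert(sizeof(struct sample_block) % 64 == 0, "size should be a multiple of 64");

struct sample_block shared_block;

/* Run-time confirmation that the object's base address is a multiple of 64. */
int block_is_64_byte_aligned(void)
{
    return ((uintptr_t)&shared_block % 64) == 0;
}
```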

Aligning data correctly within structures can cause data bloat (due to the padding necessary to place fields correctly), so where necessary and possible, it is useful to reorganize structures so that fields that require the widest alignment are first in the structure. More on solving this problem appears in the article "Preparing Code for the IA-64 Architecture (Code Clean)."
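
The effect of field order on padding can be seen by comparing two layouts of the same fields. The structures below are hypothetical examples, not taken from the referenced article; exact sizes depend on the compiler and ABI, so the sketch only asserts that the reordered layout is no larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields in arbitrary order: the compiler typically inserts padding
   after flag and after code so that id and count are aligned. */
struct padded {
    uint8_t  flag;
    uint64_t id;
    uint16_t code;
    uint32_t count;
};

/* The same fields, widest alignment first: padding is needed at most
   at the end of the structure. */
struct reordered {
    uint64_t id;
    uint32_t count;
    uint16_t code;
    uint8_t  flag;
};

static_assert(sizeof(struct reordered) <= sizeof(struct padded),
              "reordering should not increase the size");

/* Bytes saved per instance by reordering (ABI dependent). */
size_t padding_saved(void)
{
    return sizeof(struct padded) - sizeof(struct reordered);
}
```

On common 64-bit ABIs the first layout is usually larger than the second, and the difference is multiplied across every element when such structures are stored in arrays.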


Greater Flexibility with Itanium® Processors

The restrictions on data alignment have been slightly eased in the Itanium 2 processor, relative to the original Itanium processor, if the proper flags are set in code. (Check your manuals for the exact settings.) The relevant section from the Intel® Itanium® Processor Reference Manual (Section 5.5) states: "The Itanium processor implementation supports arbitrary load and store accesses except for integer accesses that cross eight-byte boundaries and any accesses that cross 16-byte boundaries." It explains that data items can occur within an aligned window, which simply reiterates the requirement that boundaries not be crossed.
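
The "aligned window" condition in the quoted passage can be expressed as a simple address calculation: an access stays inside one window when its first and last bytes fall in the same window. The sketch below is illustrative only; crosses_boundary is a hypothetical name, and the window sizes of interest (eight bytes for integer accesses, 16 bytes for any access) are taken from the quotation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when an access of size bytes starting at
   addr spans more than one aligned window of the given width.
   window must be a nonzero power of two and size must be nonzero. */
bool crosses_boundary(uintptr_t addr, size_t size, size_t window)
{
    assert(size > 0 && window != 0 && (window & (window - 1)) == 0);
    uintptr_t first_window = addr / window;
    uintptr_t last_window  = (addr + size - 1) / window;
    return first_window != last_window;
}
```

For example, a four-byte integer at offset 6 occupies bytes 6 through 9 and therefore crosses an eight-byte boundary, while the same integer at offset 4 does not, even though neither placement matters to the 16-byte rule.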

For this less stringent scheme to work, the linker and operating system must also follow these specifications. If they enforce the tighter requirements, you will have to do so as well. Test before you make any decisions on alignment, or simply take the path of greatest portability and safety by imposing the tighter alignment requirements on your data fields.

Note: It may occur to some readers that because the Itanium Processor Family supports native execution of IA-32 binary code, data-alignment exceptions could make IA-32 binaries incapable of running. When the processor is in 32-bit mode, however, it disables the generation of alignment exceptions for the code under execution, so this problem does not exist. This option, however, is not available in 64-bit mode.


Additional Resources

  • The Intel® Software Partner Program can be beneficial to ISVs who are porting to Intel architectures by providing developers with access to the latest technologies and the expertise of software engineers, for a relatively low cost.
  • Itanium® Processor Developer Center contains a wealth of information to help put the outstanding performance of the Itanium® processor to work on mission-critical applications.
  • Porting on 64-Bit Intel® Architecture provides a high-level examination of porting applications to Intel EM64T, including capabilities of the technology, as well as related caveats and limitations.
  • Take Advantage of the Memory Features of 64-Bit Computing - Correct memory management is vital to the design and execution of applications on any platform. Gain sound practices to help manage the greater memory resources gained when porting applications to the Itanium® Processor Family or Intel® Extended Memory 64 Technology.

 
