Memory Allocation and First-Touch

By Amanda K Sharp

Published: 11/07/2013   Last Updated: 09/09/2014

Compiler Methodology for Intel® MIC Architecture


Memory allocation is expensive on the coprocessor compared to the Intel® Xeon processor, so it is prudent to reuse already-allocated memory wherever possible. For example, if a function is called repeatedly (say, inside a loop) and uses an array for temporary storage, allocate the array at the maximum size needed on the first call and reuse that array in later calls:

static real *temp_array = 0;

void foo(..) {
    if (temp_array == 0) {
        /* first call only: allocate once, at the maximum size needed */
        temp_array = my_malloc(MAX_SIZE);
    }
    ... // use of temp_array
}
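
The pattern above can be sketched as a small self-contained C function. This is only an illustration under stated assumptions: `sum_of_squares`, `MAX_ELEMS`, and `alloc_count` are hypothetical names that do not appear in the original guide, `double` stands in for `real`, and plain `malloc` stands in for `my_malloc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ELEMS 1024  /* hypothetical upper bound on the scratch size */

static double *temp_array = NULL;  /* scratch buffer, allocated on the first call */
static int alloc_count = 0;        /* illustration only: counts successful mallocs */

/* Returns the sum of squares of x[0..n-1], staging the squares in
 * temp_array. Returns -1.0 if the one-time allocation fails. */
double sum_of_squares(const double *x, size_t n) {
    assert(n <= MAX_ELEMS);  /* the buffer is sized for the largest request */

    if (temp_array == NULL) {
        /* First call only: pay the allocation cost once. */
        temp_array = malloc(MAX_ELEMS * sizeof *temp_array);
        if (temp_array == NULL) {
            return -1.0;
        }
        alloc_count++;
    }

    /* Later calls reuse the same buffer instead of allocating again. */
    for (size_t i = 0; i < n; i++) {
        temp_array[i] = x[i] * x[i];
    }
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += temp_array[i];
    }
    return sum;
}
```

Calling `sum_of_squares` many times from a loop leaves `alloc_count` at 1. Note that a static buffer like this is not thread-safe as written; a per-thread buffer would be needed if the function is called from parallel code.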

Also, keep in mind that on Linux, physical memory is allocated at first touch, not at the point of the malloc call. So, if a loop traverses a previously malloc'ed (but untouched) array, the first traversal may take longer than later ones, because it also pays the cost of allocating the physical pages.
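
One way to keep the first-touch cost out of a performance-critical or timed loop is to touch the memory right after allocating it. The guide itself does not prescribe this; the sketch below is an assumption-laden illustration, and `alloc_and_touch` is a hypothetical helper name. It uses the POSIX call `sysconf(_SC_PAGESIZE)` to find the page size and writes one byte per page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate `bytes` and write one byte in every page,
 * so the physical pages are allocated here rather than during the first
 * pass of a later loop. Returns NULL if malloc fails. */
char *alloc_and_touch(size_t bytes) {
    assert(bytes > 0);

    char *buf = malloc(bytes);
    if (buf == NULL) {
        return NULL;
    }

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        page = 4096;  /* fallback assumption if the query fails */
    }

    /* Touch the first byte of each page spanned by the buffer. */
    for (size_t off = 0; off < bytes; off += (size_t)page) {
        buf[off] = 0;
    }
    return buf;
}
```

The caller owns the returned buffer and releases it with `free`. Combined with the reuse pattern above, the touch cost is paid only once for the lifetime of the buffer.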

Takeaways

Memory reuse is important for good performance on Intel MIC Architecture. Be mindful of how temporary arrays are allocated and used in your code.


It is essential that you read this guide from start to finish, using the built-in hyperlinks to guide you along a path to a successful port and tuning of your application(s) on Intel® Xeon Phi™ Coprocessors. The paths provided in this guide reflect the steps necessary to get the best possible application performance.

