GAP Message - remark #30756: (DTRANS) Split the structure ... into two parts to improve data locality ...

Message

Split the structure 'str' into two parts to improve data locality. Frequently accessed fields are 'a1,b1,c1'; performance may improve by putting these fields into one structure and the remaining fields into another structure. Alternatively, performance may also improve by reordering the fields of the structure. Suggested field order: 'a1, c1, e1, b1, carr'. [VERIFY] The suggestion is based on the field references in the current compilation. Please make sure that the restructuring is applied to field references in all source files of the application, and that the restructured code satisfies the original program semantics.
Applies to C/C++ only

Description

This message is issued when both the structure splitting and the field reordering transformations are applicable. Structure splitting is expected to yield the larger performance gain when it can be applied successfully. Field reordering is usually simpler to apply, but the performance gain may be smaller.
You must verify that the structure meets the requirements for applying the splitting or reordering transformation. Some of these requirements are listed in the descriptions of the individual transformations.
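
To give a rough sense of why splitting can pay off, the sketch below compares the distance between consecutive array elements before and after a split, using the field list of the structure 'str' from the example in this topic. The names str_orig, str_hot, stride_before, stride_after, and check_footprint are hypothetical and exist only for this illustration. Actual sizes depend on the target's int and pointer widths and on padding.

```c
#include <assert.h>
#include <stddef.h>

/* Unsplit layout: hot fields a1, b1, c1 share each element with carr. */
struct str_orig {
    int a1, b1, carr[100], c1, e1;
};

/* Rarely accessed fields, moved out of the hot array. */
struct str_cold {
    int carr[100], e1;
};

/* Hot part: the frequently accessed fields plus a link to the cold part. */
struct str_hot {
    int a1, b1, c1;
    struct str_cold *cold_ptr;
};

/* Bytes between consecutive elements walked by a loop over the hot fields. */
size_t stride_before(void) { return sizeof(struct str_orig); }
size_t stride_after(void)  { return sizeof(struct str_hot); }

/* The split only helps locality if the hot element is actually smaller. */
void check_footprint(void)
{
    assert(stride_after() < stride_before());
}
```

Assuming a 4-byte int and 8-byte pointers, stride_before() would be 416 bytes and stride_after() 24 bytes, so a loop over a1, b1, or c1 touches far fewer cache lines after the split. Field reordering leaves the stride unchanged and only groups the hot fields within each element, which is consistent with its smaller expected gain.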

Example

//str_split_reord.c
struct str {
    int a1, b1, carr[100], c1, e1;
};

#include <stdlib.h>  /* for malloc */

#define N 1000000

struct str *sp;

void allocate_str_mem()
{
    sp = malloc(N * sizeof(struct str));
}

int hot_func1() {
    int i, ret = 0;

    for (i = 0; i < 1000000; i++) {
        ret += sp[i].a1;
        ret += sp[i].c1;
    }
    sp->carr[0] = ret;
    return ret;
}

int hot_func2() {
    int ret = 0, i;
    for (i = 0; i < 100000; i++) {
        ret += sp[i].a1;
        ret -= sp[i].e1;
    }
    return ret;
}

int hot_func3() {
    int ret = 0, i;
    for (i = 0; i < 1000000; i++) {
        ret += sp[i].b1;
    }
    return ret;
}

For the above example, the compiler generates the following advice with
'icl -Qguide=4 -c str_split_reord.c'

 

str_split_reord.c(2): remark #30756: (DTRANS) Split the structure 'str' into two parts to improve data locality. Frequently accessed fields are 'a1, b1, c1'; performance may improve by putting these fields into one structure and the remaining fields into another structure. Alternatively, performance may also improve by reordering the fields of the structure. Suggested field order: 'a1, c1, e1, b1, carr'. [VERIFY] The suggestion is based on the field references in the current compilation. Please make sure that the restructuring is applied to field references in all source files of the application, and that the restructured code satisfies the original program semantics.

 

The example above can be modified as shown below to split the structure 'str' as suggested. References to structure 'str' in other modules must be modified in the same way.

struct str_cold {
    int carr[100], e1;
};

struct str {
    int a1, b1, c1;
    struct str_cold *cold_ptr;
};

#include <stdlib.h>  /* for malloc */

#define N 1000000

struct str *sp;

void allocate_str_mem()
{
    struct str *temp;
    struct str_cold *cold_begin;
    int index;

    temp = malloc(N * sizeof(struct str) + N * sizeof(struct str_cold));
    sp = temp;
    cold_begin = (struct str_cold *) (temp + N);
    for(index = 0; index < N; index++) {
       temp[index].cold_ptr = cold_begin + index;
    }
}

int hot_func1() {
    int i, ret = 0;

    for (i = 0; i < 1000000; i++) {
        ret += sp[i].a1;
        ret += sp[i].c1;
    }
    sp->cold_ptr->carr[0] = ret;
    return ret;
}

int hot_func2() {
    int ret = 0, i;
    for (i = 0; i < 100000; i++) {
        ret += sp[i].a1;
        ret -= sp[i].cold_ptr->e1;
    }
    return ret;
}

int hot_func3() {
    int ret = 0, i;
    for (i = 0; i < 1000000; i++) {
        ret += sp[i].b1;
    }
    return ret;
}
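
Because the remark asks you to confirm that the restructured code keeps the original program semantics, a small comparison such as the sketch below may help. It runs the hot_func2 computation on both layouts with identical data and compares the results. The names str_orig, str_hot, M, sum_orig, sum_split, layouts_agree, and check_layouts_agree are hypothetical, and the cold parts are allocated separately here for simplicity rather than in one block as in the modified example.

```c
#include <assert.h>
#include <stdlib.h>

#define M 1000  /* hypothetical small element count, enough for a check */

struct str_orig { int a1, b1, carr[100], c1, e1; };
struct str_cold { int carr[100], e1; };
struct str_hot  { int a1, b1, c1; struct str_cold *cold_ptr; };

/* Same computation as hot_func2, on the original layout. */
static int sum_orig(const struct str_orig *p, int n)
{
    int i, ret = 0;
    for (i = 0; i < n; i++) {
        ret += p[i].a1;
        ret -= p[i].e1;
    }
    return ret;
}

/* Same computation on the split layout: e1 is reached through cold_ptr. */
static int sum_split(const struct str_hot *p, int n)
{
    int i, ret = 0;
    for (i = 0; i < n; i++) {
        ret += p[i].a1;
        ret -= p[i].cold_ptr->e1;
    }
    return ret;
}

/* Returns 1 if both layouts agree, 0 if they differ, -1 if allocation fails. */
int layouts_agree(void)
{
    struct str_orig *o = malloc(M * sizeof *o);
    struct str_hot *h = malloc(M * sizeof *h);
    struct str_cold *c = malloc(M * sizeof *c);
    int i, same = -1;

    if (o && h && c) {
        for (i = 0; i < M; i++) {
            h[i].cold_ptr = &c[i];          /* link hot element to its cold part */
            o[i].a1 = h[i].a1 = i;          /* identical test data in both layouts */
            o[i].e1 = c[i].e1 = 2 * i + 1;
        }
        same = (sum_orig(o, M) == sum_split(h, M));
    }
    free(o);
    free(h);
    free(c);
    return same;
}

/* Aborts if the two layouts disagree or memory could not be allocated. */
void check_layouts_agree(void)
{
    assert(layouts_agree() == 1);
}
```

A check like this covers only the references it exercises. References to 'str' in other source files still have to be reviewed and updated by hand.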

For the above example, the only source change required to reorder the fields of structure 'str', as alternatively suggested, is the following:

//str_split_reord.c
struct str {
    int a1, c1, e1, b1, carr[100];
};

...
...
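
The effect of the suggested order on the layout can be inspected with offsetof, as in the sketch below. It measures how many bytes the four scalar fields span in each order. The names str_orig, str_reordered, scalar_span_orig, scalar_span_reordered, and check_reordering are hypothetical and used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct str_orig      { int a1, b1, carr[100], c1, e1; };
struct str_reordered { int a1, c1, e1, b1, carr[100]; };

/* Bytes from the start of a1 to the end of e1, the last scalar field
   in the original order. */
size_t scalar_span_orig(void)
{
    return offsetof(struct str_orig, e1) + sizeof(int)
         - offsetof(struct str_orig, a1);
}

/* Same span in the suggested order, where b1 is the last scalar field. */
size_t scalar_span_reordered(void)
{
    return offsetof(struct str_reordered, b1) + sizeof(int)
         - offsetof(struct str_reordered, a1);
}

void check_reordering(void)
{
    /* Reordering should not change the element size... */
    assert(sizeof(struct str_reordered) == sizeof(struct str_orig));
    /* ...but it should bring the scalar fields closer together. */
    assert(scalar_span_reordered() < scalar_span_orig());
}
```

Assuming a 4-byte int and no padding between the fields, the scalar fields span 16 bytes in the suggested order, against the full 416-byte element in the original order, where carr sits between b1 and c1.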
