Erroneous use of __sec_implicit_index() ?

I want to pass an array index to a routine that uses it to compute a result, which should then be assigned to the corresponding element of an output array.

I mean the following:

for(int i=0; i<I_max; i++)
     a_output[i] = foo(i, a1,a2, ...);

Instead of this loop, I tried to use the Cilk Plus array notation:
a_output[0:I_max] = foo(__sec_implicit_index(0) , a1,a2, ...);

This line causes an error (Intel C++ Compiler XE 14.0):

1>  CilkArrNot_test.cpp
1>D:\CilkArrNot_test\CilkArrNot_test\CilkArrNot_test.cpp(37): warning #18024: implicit index must be used in an array section context
1>" : error : ** segmentation violation signal raised **

What is the problem with the code line above?

Maybe it is a bad idea to use __sec_implicit_index(0) for this purpose? What is the right way to collapse the given loop into an array section operation?


The code looks like correct use of array notation.  I recommend filing a bug report against the compiler. (I can help if you do not have access to the compiler bug-reporting system.)  It worked for me with icl 14.0.1.103 and icc 14.0.1.106, using the small example after my signature.

- Arch


const int n = 100;
float a[n], b[n];

float foo( int i, float c ) {
    return i*i+c;
}

void bar(int n, float b ) {
    a[0:n] = foo(__sec_implicit_index(0),b);
}

#include <cstdio>

int main() {
    bar(n,2.0f);
    for( int i=0; i<n; ++i )
        if( a[i]!=i*i+2 )
            std::printf("ERROR\n");
}
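
For readers without a Cilk Plus capable compiler, the array-notation statement in the example above is expected to behave like an ordinary indexed loop. Below is a minimal sketch in standard C++ of that intended result, assuming __sec_implicit_index(0) yields 0, 1, ..., n-1 over a section that starts at element 0. The helper name fill_by_index is hypothetical and not part of any Cilk Plus API.

```cpp
#include <cstddef>
#include <vector>

// Same per-element function as in the example above.
float foo(int i, float c) {
    return i * i + c;
}

// Hypothetical helper: the scalar loop that
//   a[0:n] = foo(__sec_implicit_index(0), b);
// is assumed to be equivalent to.
std::vector<float> fill_by_index(std::size_t n, float b) {
    std::vector<float> a(n);
    for (std::size_t i = 0; i < n; ++i)
        a[i] = foo(static_cast<int>(i), b);  // i stands in for the implicit index
    return a;
}
```

Calling fill_by_index(100, 2.0f) should produce the same values the check loop in main() above expects, namely i*i+2 for each element.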

 

Hi Arch,

Please look at this code, which seems even simpler than the test you suggested:

#include <cstdio>
#include <cilk/cilk.h>
const int n = 100;

float foo(int i)
{
 std::printf("Index i=%d", i);
 return i*i;
}

int main()
{
 float a[n];
 a[0:n] = foo(__sec_implicit_index(0));
}
 

The result of compilation (Visual Studio 2013, Intel(R) 64 compiler, Version 14.0.1.139 Build 20131008) is the following:

1>D:\Philips\CilkArrNot_test\CilkArrNot_test\CilkArrNot_test.cpp(18): warning #18024: implicit index must be used in an array section context
1>  xilink: executing 'link'
1>CilkArrNot_test.obj : error LNK2019: unresolved external symbol ___sec_implicit_index referenced in function _main
1>D:\Philips\CilkArrNot_test\Release\CilkArrNot_test.exe : fatal error LNK1120: 1 unresolved externals

If the printf() line in foo() is commented out, the compilation succeeds! Even stranger: if 'i' in the printf() call is replaced with a constant ('0', for example), the compilation succeeds as well!

Very interesting. Are you able to reproduce it on your system?

Such oddities are common with modern compilers, and are very often option-dependent (especially with optimisation options).  What options are you using?

 

Optimisation is "Maximize Speed (/O2)"; all Intel-compiler-specific optimisation options are "No" or "Default" (the most "dangerous" one, IPO, is "No").

By the way, the behavior described above occurs only in the "Release" configuration; in "Debug" it is OK.

Which other options would you suggest checking?

I second Arch: the code is correct and should work. An issue report should be filed against the compiler.

The source of the issue is that a[] is unused. I added the following line to the end of the code:

int main()
 {
  float a[n];
  a[0:n] = foo(__sec_implicit_index(0));

  return __sec_reduce_add(a[:]); // <<<<<< This is to use a[]
 }

And the code started working.
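
The workaround above amounts to consuming the array so that the assignment is no longer dead code. Here is a rough standard-C++ sketch of the same shape, assuming __sec_reduce_add(a[:]) is a plain sum over all elements. The name fill_and_sum is hypothetical, and this portable version does not itself reproduce the compiler bug.

```cpp
#include <array>
#include <numeric>

const int n = 100;

float foo(int i) {
    return i * i;
}

// Hypothetical helper: fill a[] by index, then reduce it so the stores are observable.
float fill_and_sum() {
    std::array<float, n> a{};
    for (int i = 0; i < n; ++i)
        a[i] = foo(i);  // stands in for a[0:n] = foo(__sec_implicit_index(0));
    // Stands in for __sec_reduce_add(a[:]).
    return std::accumulate(a.begin(), a.end(), 0.0f);
}
```

Because the sum depends on every element, an optimizer cannot simply discard the fill loop, which is the property the added return statement relies on.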

One characteristic of such bugs is that ANY option that affects code generation can make them appear and disappear: optimisation, floating-point model, whether symbols are generated, etc.  So they can be very hard to repeat unless you know those!
 

Thank you, Nick and Serge, for your interest in the case!

Arch succeeded in reproducing the wrong behavior on Linux and has submitted a bug report to the compiler team.

My examples using __sec_implicit_index() give full performance on the corei7-4 laptop, but on Westmere, corei7 and MIC they are slow.  It seems to make no difference which /arch is set (even SSE runs reasonably fast on the AVX2 laptop):

float a[],b[],c__[]

a[1:i__2] = b[1:i__2] + c__[1:i__2] * (__sec_implicit_index(0)+1);

a[1:i__2] = (__sec_implicit_index(0)+1)*2 * b[1:i__2];
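
For reference, a scalar sketch of the first statement in standard C++. It assumes, as the +1 suggests, that __sec_implicit_index(0) counts from 0 relative to the start of the section rather than giving the absolute subscript. The function name scaled_add and the use of std::vector are placeholders for illustration only.

```cpp
#include <vector>

// Hypothetical scalar form of:
//   a[1:n] = b[1:n] + c[1:n] * (__sec_implicit_index(0)+1);
// The arrays must hold at least n+1 elements, since the section starts at index 1.
void scaled_add(std::vector<float>& a, const std::vector<float>& b,
                const std::vector<float>& c, int n) {
    for (int k = 0; k < n; ++k) {
        // k plays the role of the implicit index; (k + 1) is an int that
        // must be converted to float before the multiply.
        a[1 + k] = b[1 + k] + c[1 + k] * static_cast<float>(k + 1);
    }
}
```

The integer-to-float conversion of the index is explicit in this form; cvtsi2ss is the scalar instruction that performs that kind of conversion.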

In the generated code for corei7 there are cvtsi2ss instructions in the otherwise vectorized loop.  Do those incur partial register stalls up through Westmere?

Code generation for initializing an identity matrix, such as

cilk_for(int j = 0; j < N; ++j)

     aa[j][:] = __sec_implicit_index(0) == j;

seems relatively good.  I'm still wavering about whether this is a good idea compared with old-fashioned source code.
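
For comparison, the "old-fashioned" form of the identity-matrix initialization might look like the sketch below, assuming __sec_implicit_index(0) takes the values 0..N-1 along each row. The size N = 4, the float element type, and the name make_identity are placeholders chosen for illustration, and the outer loop here is serial rather than a cilk_for.

```cpp
#include <array>

constexpr int N = 4;  // placeholder size
using Matrix = std::array<std::array<float, N>, N>;

// Hypothetical helper: plain nested loops standing in for
//   aa[j][:] = __sec_implicit_index(0) == j;
Matrix make_identity() {
    Matrix aa{};
    for (int j = 0; j < N; ++j)
        for (int k = 0; k < N; ++k)  // k stands in for the implicit index
            aa[j][k] = (k == j) ? 1.0f : 0.0f;
    return aa;
}
```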
