gsl library compiled using -mmic -O2 coredumps

Hi all,

I'm porting some software to the Xeon Phi that's using gsl. I've downloaded gsl 1.16 and configured and compiled it using

  ./configure --host=x86_64-unknown-linux-gnu CC=icc CXX=icpc CFLAGS="-mmic -O2"

(using icc 14.0.3 20140422)

The code compiles OK, but the test code coredumps on the Xeon Phi itself. Multiple components of gsl coredump; one of them is 'vector':

mic0> gdb ./test
GNU gdb (GDB) 7.5+mpss3.2.3

(gdb) r
Starting program: /home/janjust/src/gsl-1.16/vector/test

Program received signal SIGSEGV, Segmentation fault.
0x000000000040f026 in test_complex_func (stride=16, N=32) at test_complex_source.c:121
121            if (v->data[2*i*stride] != (ATOMIC) (i) || v->data[2*i*stride + 1] != (ATOMIC) (i + 1234))

The weird thing is that the function where it crashes is never called with stride=16, N=32, so it seems the optimizer altered something.

If I remove the "-O2" then the code runs OK. The same code with CC=icc CFLAGS="-O2" runs fine on the host CPU (Xeon E5). Is this a compiler optimisation error? How do I 'downgrade' the compiler optimisation for a particular piece of code? How can I troubleshoot this further?





There should be no difference between "-mmic -O2" and just "-mmic" since -O2 is the default.

You can control optimization at the routine level with #pragma optimize, which is described in the compiler documentation.

The symptoms (a SegV on the coprocessor but success on the host CPU) suggest a possible unaligned access on the coprocessor. I've seen an ASSERT used in other cases to help detect unaligned addresses, but I don't know whether this works for structure members. It might help determine whether one or more accesses of v are unaligned. I will inquire with others. Maybe something like this:

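/* needs <assert.h> and <stdint.h>; a pointer must be converted to an integer before using % */
uintptr_t v_addr1 = (uintptr_t) &(v->data[2*i*stride]);
assert(v_addr1 % 64 == 0);

uintptr_t v_addr2 = (uintptr_t) &(v->data[2*i*stride + 1]);
assert(v_addr2 % 64 == 0);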


Here is some guidance from Development regarding the details of your earlier post:

Normally, in plain C code without intrinsics the addresses are not required to be 64-byte aligned – only element-wise alignment is required (for example, pointer to ‘double’ must have 8-byte alignment). I believe the ‘data’ array is allocated using ‘malloc’ so it should be already aligned properly.

This SegV might be due to a bug in the vectorizer, so I suggest trying a newer compiler (15.0) or disabling vectorization of the particular loop around the problematic line:

121            if (v->data[2*i*stride] != (ATOMIC) (i) || v->data[2*i*stride + 1] != (ATOMIC) (i + 1234))

To disable vectorization of the loop, #pragma novector can be used, as follows:

#pragma novector
for (i = 0; i < N; i++)

Please let me know whether this is still reproducible with 15.0 and, if so, whether I can get a reproducer to pass on to Development for further investigation and development of any associated fix.

I've upgraded mpss to 3.3, installed icc v15.0 and reran my test. The coredumps are now gone, but there are some new failing tests (not coredumps, just wrong results). I will open a new ticket for them.

By adding a few '#pragma novector' lines and a few hacks to the gsl test scripts I am now able to run all gsl tests successfully on a Xeon Phi!

