Intel C++ 13 Difference with -O2: Unsigned Integer Constant Propagation

We are seeing a change (apparently a regression) when moving from version 12 to version 13 of the Intel C++ compiler on Linux. It occurs at the -O2 optimization level (and disappears at -O1) on both Intel and AMD processors. Code and makefile are appended. Can someone verify this (apparently broken) behavior? Should we report it as a bug, or does this code stray into an area not defined by the standard?

The reproducer declares an instance of C, initializes it with a value set via a preprocessor macro, then tests it, increments it, and tests it again. Intel 12.1.5 (correctly) seems to do a thorough job of constant propagation and eliminates the conditionals in favor of just printing two values and a final "Test: PASSED" message.

Intel 13 doesn't seem to do the same level of constant propagation and instead generates code that actually carries out the operations expressed in the reproducer. A disassembly of the relevant section of the "big" executable shows:

400947: 66 0f 6f 05 41 05 00 movdqa 0x541(%rip),%xmm0 # 400e90
40094e: 00
40094f: 48 8b 35 3a 05 00 00 mov 0x53a(%rip),%rsi # 400e90
400956: 0f ae 14 24 ldmxcsr (%rsp)
40095a: 66 0f 7f 44 24 10 movdqa %xmm0,0x10(%rsp)
400960: e8 83 fe ff ff callq 4007e8 <std::basic_ostream<char, std::char_traits<char> >::operator<
400965: 48 89 c7 mov %rax,%rdi
400968: be 08 08 40 00 mov $0x400808,%esi
40096d: e8 86 fe ff ff callq 4007f8 <std::basic_ostream<char, std::char_traits<char> >::operator<
400972: 48 8b 74 24 10 mov 0x10(%rsp),%rsi
400977: 41 bd 01 00 00 00 mov $0x1,%r13d
40097d: 33 db xor %ebx,%ebx
40097f: 48 81 fe ff ff ff 7f cmp $0x7fffffff,%rsi

The first movdqa instruction seems to load data from 0x400e90, in the .rodata section; sure enough, the value at that address is 0x000000007fffffff, the comparison at 0x40097f finds the two values equal, and the "big" executable produces the expected output:

% ./big
7fffffff
80000000
Test: PASSED

A disassembly of the "bigger" executable looks similar:

400947: 66 0f 6f 05 51 05 00 movdqa 0x551(%rip),%xmm0 # 400ea0
40094e: 00
40094f: 48 8b 35 4a 05 00 00 mov 0x54a(%rip),%rsi # 400ea0
400956: 0f ae 14 24 ldmxcsr (%rsp)
40095a: 66 0f 7f 44 24 10 movdqa %xmm0,0x10(%rsp)
400960: e8 83 fe ff ff callq 4007e8 <std::basic_ostream<char, std::char_traits<char> >::operator<
400965: 48 89 c7 mov %rax,%rdi
400968: be 08 08 40 00 mov $0x400808,%esi
40096d: e8 86 fe ff ff callq 4007f8 <std::basic_ostream<char, std::char_traits<char> >::operator<
400972: 48 bb 00 00 00 80 00 mov $0x80000000,%rbx
400979: 00 00 00
40097c: 41 bc 01 00 00 00 mov $0x1,%r12d
400982: 48 8b 74 24 10 mov 0x10(%rsp),%rsi
400987: 48 3b de cmp %rsi,%rbx

but this time the value at 0x400ea0 is 0xffffffff80000000 rather than the expected 0x0000000080000000, so the cmp at 0x400987 ends up comparing 0x80000000 (in %rbx) with 0xffffffff80000000 (in %rsi). The output of "bigger" is:

% ./bigger
ffffffff80000000
ffffffff80000001
Test: FAILED

It seems to fail only with values larger than 0x7fffffff.

---- reproducer.cc ----
#include <ios>
#include <iostream>

struct C
{
    C(unsigned long d1)
    {
        data[0] = d1;
        data[1] = 0;
        data[2] = 0;
        data[3] = 0;
    }

    unsigned long data[4];
};

using namespace std;

int main(void)
{
    int failures = 0;
    C val( VAL );

    cout << hex << val.data[0] << endl;
    if (val.data[0] != VAL ) ++failures;

    ++val.data[0];

    cout << val.data[0] << endl;
    if (val.data[0] != VAL + 1) ++failures;

    if (failures > 0) cout << "Test: FAILED" << endl;
    else cout << "Test: PASSED" << endl;

    return 0;
}

---- Makefile ----
CXXFLAGS := -g -O2 -Wall

.PHONY: all clean

all: big bigger

big: CXXFLAGS += -DVAL=0x7fffffffUL
big: reproducer.cc
	$(CXX) $(CXXFLAGS) $^ -o $@

bigger: CXXFLAGS += -DVAL=0x80000000UL
bigger: reproducer.cc
	$(CXX) $(CXXFLAGS) $^ -o $@

clean:
	-$(RM) big bigger *.o

Thanks for the description and test-case!

PS: Guys, when compiling the test-case, please note that VAL is a macro defined on the command line (!):
...
big: CXXFLAGS += -DVAL=0x7fffffffUL
...

Hello,

I can confirm this problem and have forwarded it to engineering (DPD200240531). I'll let you know as soon as it is fixed.
It might also be related to this: http://software.intel.com/en-us/forums/topic/358200

Best regards,

Georg Zitzlsberger

Thank you, Georg.  

    Rob Cunningham

Hello,

Sorry for the late response. The reported problem is fixed in Intel(R) Composer XE 2013 SP1 and later.

Best regards,

Georg Zitzlsberger

Georg, forgive my ignorance -- does that translate into a specific version of the compiler? We have up to version 14.0.1 available on our HPC clusters, so I'm hoping for advice on which release to try first. Thanks for your help.

Rob Cunningham

Georg's comment would indicate that all 14.0.0 and later versions should include this bug fix. 14.0.2 has been current for nearly 2 months.
The designation of 14.0 as 2013 SP1 seems designed to introduce a new level of confusion.
