I read the GCC documentation as well as the Intel compiler documentation. Both say that, by default, the compiler does not detect the underlying microarchitecture on Linux x86-64 (the default is -xsse2). As a result, I need to add -march=nehalem to CXXFLAGS for g++ and -xsse4.2 to CXXFLAGS for icpc. However, when linking with MKL, there is no flag to tell MKL that my microarchitecture is Nehalem:
g++ -std=c++11 -O2 -march=nehalem -c main.cpp
g++ main.o -lmkl_rt
icpc -std=c++11 -xsse4.2 -c main.cpp
icpc -mkl main.o
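For reference, here is a minimal sketch of how the effect of the compile flags above could be checked from inside main.cpp. It assumes the compiler predefines the usual GCC-style macros (__SSE2__, __SSE4_2__, __AVX__, __AVX2__), which as far as I know g++ and icpc on Linux both do. The function name compiled_simd_level is hypothetical and not part of any library. It only reports what my own translation unit was compiled for and says nothing about which code path MKL selects:

```cpp
#include <string>

// Hypothetical helper: reports the highest SIMD level this translation
// unit was compiled for, based on compiler-predefined macros.
// Assumption: the compiler sets GCC-style macros such as __SSE4_2__.
std::string compiled_simd_level() {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";   // expected with -march=nehalem or -xsse4.2
#elif defined(__SSE2__)
    return "SSE2";     // the x86-64 baseline when no target flag is given
#else
    return "unknown";
#endif
}
```

Printing the result of compiled_simd_level() from main() should show whether -march=nehalem or -xsse4.2 actually reached the compiler.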
So, how can I guarantee that MKL takes full advantage of Nehalem and delivers the highest performance?
Thank you very much.