Intel® C++ Compiler

ICC 16.0.2 fails at runtime when allocating array of std::shared_ptr objects

The Windows ICC 16.0.2 compiler (using VS2015 Update 1) fails at run time when allocating an array of shared_ptr. The error is an Access Violation. The same code compiled with the MSVC compiler alone ran fine.

Stepping through the code didn't show anything obvious, so I tried converting the raw array of shared_ptr (std::shared_ptr<T>*) to a std::vector of shared_ptr (std::vector<std::shared_ptr<T>>). That let execution get further, but it crashed later with the same issue in another piece of code.
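For readers who want to reproduce the two variants described above, here is a minimal sketch. The type `Widget` and both helper functions are hypothetical names made up for illustration; the original poster's code is not shown in the thread, so this is only an assumption about what the allocation looked like.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical payload type, standing in for the poster's T.
struct Widget {
    int id;
    explicit Widget(int i) : id(i) {}
};

// Variant 1: raw array of shared_ptr allocated with new[].
// The caller owns the array and must release it with delete[].
std::shared_ptr<Widget>* make_raw_array(std::size_t n) {
    auto* arr = new std::shared_ptr<Widget>[n];  // n empty shared_ptrs
    for (std::size_t i = 0; i < n; ++i) {
        arr[i] = std::make_shared<Widget>(static_cast<int>(i));
    }
    return arr;
}

// Variant 2: std::vector of shared_ptr, which manages its own storage.
std::vector<std::shared_ptr<Widget>> make_vector(std::size_t n) {
    std::vector<std::shared_ptr<Widget>> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(std::make_shared<Widget>(static_cast<int>(i)));
    }
    return v;
}
```

Both variants are valid C++11, so a crash in either one would point at the toolchain rather than the source, assuming the real code is equivalent to this sketch.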

Linking Intel OpenMP library with GCC

Hi guys,
I know this is a very common, well-known issue, but I think this forum should have a dedicated thread on it so that there is an official answer.

I'm compiling with GCC and I'm using Intel MKL (that calls Intel OpenMP library).
Result: BOTH libgomp (from GCC) and libiomp5 (from Intel) are linked.
I would like to avoid this double linking, but how?

In the Intel User Guide I read that I should use:

How to broadcast 4 floats into 4 lanes?

Hi there,

After reading a lot of material, I still cannot find out how to broadcast 4 float variables into the 4 lanes of a vector register on MIC.

e.g. float array[4]={a,b,c,d};

How can I load it into a vector register as {aaaa,bbbb,cccc,dddd} using a single intrinsic?

If I use _mm512_mask_blend_ps, it takes 4 intrinsics.
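To pin down the layout being asked for, here is a scalar reference sketch. It is not an intrinsic solution and says nothing about which MIC instruction achieves this; the function name `broadcast4_to_lanes` is hypothetical, and it assumes a 512-bit register viewed as 16 floats split into four 4-float lanes, with lane 0 at the lowest indices.

```cpp
#include <array>
#include <cstddef>

// Scalar reference for the target layout: each of the 4 source floats
// fills one whole 4-element lane of a 16-float vector.
// {a,b,c,d} -> {a,a,a,a, b,b,b,b, c,c,c,c, d,d,d,d}
std::array<float, 16> broadcast4_to_lanes(const float (&src)[4]) {
    std::array<float, 16> out{};
    for (std::size_t lane = 0; lane < 4; ++lane) {
        for (std::size_t elem = 0; elem < 4; ++elem) {
            out[4 * lane + elem] = src[lane];
        }
    }
    return out;
}
```

A reference like this is mainly useful for checking the output of whatever intrinsic sequence is eventually chosen.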

Intel C++ Compiler 16.0.2 fails to compile C++11 code (using GCC 5.3.0 headers)

Hello,

I can't compile the following code:

#include <unordered_map>
#include <utility>
#include <tuple>

int main(int argc, char** argv) {
  std::unordered_map<int, int> map;
  int a1 = 0;
  int a2 = 1;
  map.emplace(std::piecewise_construct, std::make_tuple(a1), std::make_tuple(a2));
  return 0;
}

using the Intel C++ Compiler version 16.0.2 (from Parallel Studio XE 2016.2.181), when using the libstdc++ headers which ship with GCC 5.3.0. I'm compiling like this:

icpc -std=c++11 -o bla bla.cc

Access violation or stack overflow during compilation.

I have switched to icc version 16.0.2 (gcc version 5.0.0 compatibility), and now I cannot compile one of my C++ codes.

Compiler complains:

-------------------------------

[100%] Building CXX object CMakeFiles/test3.dir/maintest3.cpp.o
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

compilation aborted for /homes/doua/tdumont/DGSage/Essai/maintest3.cpp (code 4)

/fp:strict flushes subnormals to zero

I have tested the C++ compiler 2016 Update 2.

As it turns out, it automatically flushes subnormal numbers to zero if the /fp:strict model is chosen, even though "Flush Denormal Results to Zero" is set to "No".

Is this the intended behavior? (Previous versions worked differently, if I am correct.)
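One way to check the behavior at run time is a small probe like the sketch below. The function name `subnormals_preserved` is hypothetical, and this only tests a single float division; it is an assumption that this operation is representative of what the poster observed.

```cpp
#include <cmath>
#include <limits>

// Returns true if halving the smallest normal float yields a subnormal
// result, i.e. the result was NOT flushed to zero.
bool subnormals_preserved() {
    // volatile keeps the compiler from folding the division at compile time,
    // so the current run-time floating-point mode decides the result.
    volatile float smallest_normal = std::numeric_limits<float>::min();
    volatile float half = smallest_normal / 2.0f;
    return std::fpclassify(half) == FP_SUBNORMAL;
}
```

If flush-to-zero is active, the division result becomes zero and the function returns false.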

Multiple code paths and intrinsics

I have some code where I'm using a combination of automatically vectorized code (with many different possible CPU paths, including SSE2, SSE4.2, AVX and AVX2) and some hand-written intrinsics.

One function may contain both types of loops. So, what I would like is to be able to tell the compiler which of the hand-written types of code to use (they are written using SSE2 and AVX2). But I really don't want to have to write separate dispatcher functions for each loop - including having to make all the variables that are needed available in the dispatched function.
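One possible shape for this, sketched under assumptions: local lambdas that capture by reference can hold each hand-written variant, so the loop bodies see the enclosing function's variables without a separate dispatcher signature. Everything here is hypothetical (`SimdPath`, `scaled_sum`, and the way the path is chosen), and plain scalar loops stand in for the SSE2 and AVX2 intrinsics so the sketch stays portable.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tag for the hand-written variants mentioned in the post.
enum class SimdPath { SSE2, AVX2 };

float scaled_sum(const std::vector<float>& x, float scale, SimdPath path) {
    float total = 0.0f;

    // Each lambda captures the locals by reference, so nothing has to be
    // passed through a dispatcher's parameter list.
    auto loop_sse2 = [&] {
        // Placeholder: the SSE2 intrinsic loop would go here.
        for (std::size_t i = 0; i < x.size(); ++i) total += x[i] * scale;
    };
    auto loop_avx2 = [&] {
        // Placeholder: the AVX2 intrinsic loop would go here.
        for (std::size_t i = 0; i < x.size(); ++i) total += x[i] * scale;
    };

    if (path == SimdPath::AVX2) {
        loop_avx2();
    } else {
        loop_sse2();
    }
    return total;
}
```

How `path` gets its value (a CPU feature check at startup, for example) is left open here, and whether this interacts well with the compiler's own automatic dispatch is not something this sketch establishes.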

Good practices and design choices for intrinsics

Hi,

I think I have a good background on how a CPU and memory work. I know the usual facts about CPUs, especially Intel CPUs: a cache line is usually 64 bytes, each core has dedicated SSE and/or AVX registers, and so on. I'm also fairly familiar with the common practices used to improve performance and memory usage, avoid cache spills, and handle the modern scenarios that make good software for concurrency and parallelism in its various forms (mainly multicore and SIMD).

error #10037 unable to find 'ar.exe'

Hello,

I am trying to compile the GFX_Samples project at (<install-dir>/samples_2016/<locale>/compiler_c/psxe/gfx_samples), using Visual Studio 2013 and Intel Parallel Studio XE 2016 Update 2 Cluster Edition. I've installed every option and I am still seeing this error message on every project in the sample package...

Wrong (or not) linking command line for xilibtool executing libtool with Xcode on Mac

Dear Intel Compiler Experts,

An issue that makes a project impossible to link on certain machine setups is currently blocking us on Mac OS.

For context, the project is a dynamic library composed of one static library compiled by ICC and other files compiled and linked (with this static lib) by the Apple compiler. On some computers the build works fully with Xcode 5.1.1 and ICC 15, but on others it does not. The exact same issue also happens with Xcode 7.2 and ICC 16.

The issue is the following one:

xilibtool: executing 'libtool'
