Intel® C++ Compiler

Intel® Parallel Studio XE 2016 Beta program has begun

Hello everyone,

The Intel® Parallel Studio XE 2016 Beta program has just begun, and we welcome you to participate. New features and improvements are listed under the "Change History" section of the Intel C++ Compiler Release Notes:

Free webinar April 7 2015 9am PST "Further Vectorization Features of the Intel Compiler"

A free webinar, “Further Vectorization Features of the Intel Compiler”, takes place next Tuesday and focuses on getting more vectorization out of the Intel Compilers. You will benefit more from it if you have already watched or listened to the previous webinar, “Performance essentials using OpenMP* 4.0 vectorization with C/C++”.

Intel® Xeon Phi™ Coprocessor code named “Knights Landing” - Application Readiness

As part of the application readiness efforts for future Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors (code named Knights Landing), developers are interested in improving two key aspects of their workloads:

  1. Vectorization/code generation
  2. Thread parallelism

This article mainly talks about vectorization/code generation and lists some helpful tools and resources for thread parallelism.
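As a rough illustration of the two aspects listed above, the sketch below combines thread parallelism and vectorization in one loop using the OpenMP* 4.0 combined construct `parallel for simd`. The `saxpy` kernel and its parameter names are hypothetical and are not taken from the article; when OpenMP is not enabled, the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical kernel (not from the article): y[i] = a * x[i] + y[i].
// "parallel for" splits the iterations across threads, and "simd" asks the
// compiler to vectorize the chunk each thread receives.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::ptrdiff_t n =
        static_cast<std::ptrdiff_t>(x.size() < y.size() ? x.size() : y.size());
    const float* xp = x.data();
    float* yp = y.data();

    #pragma omp parallel for simd
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

Whether the loop is actually vectorized depends on the compiler and its options; a vectorization report is the usual way to check.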

  • Developers
  • Server
  • Intermediate
  • Intel® C++ Compiler
  • Intel® AVX-512
  • Knights Landing
  • Intel SDE
  • Intel® IMCI
  • Intel® Many Integrated Core Architecture
  • Parallel Computing
  • Vectorization
  • We optimized and optimized, but never finished optimizing!

    Optimization? Of course, everyone has run into this task when developing applications of any significant size that require a certain amount of computation. There is a huge number of ways to optimize code and, as a consequence, many different ways to do it automatically with compiler options. This is where the problem arises: how do we choose what we need without getting lost?

    Which parallelization library to use for realtime processing?

    I'm developing realtime audio processing software. There may be several processors (even 100, for example) active at any moment, in several parallel chains. I cannot make the processors cooperate and must assume any possible processing order. Each of them receives a block of data, usually 256-1024 values, and needs to process it as quickly as possible so that the results can be passed to the next item in the chain. If the data is not delivered in time, bad things happen... But in many cases only a few processors are in use, and then the goal is to keep overall CPU usage minimal.

    Memory latency numbers


    I've been working on a talk about cache effects. My main reference is "What every programmer should know about memory". I've been working on a small benchmark that shows the different cache levels by walking an array of objects in linear or randomized order. The 3 levels of cache are quite obvious when walking in randomized order, and I measure about one cache line every 300 clock cycles on a Xeon 2xxx once the program has to go to main memory.
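A benchmark of the kind described above can be sketched as a pointer chase. This is not the poster's code: the names `Node`, `make_chain`, `chase`, and `ns_per_access` are hypothetical, and the 64-byte node size is an assumption about the cache line size. The randomized order is built with Sattolo's algorithm so that the walk visits every node exactly once before repeating.

```cpp
#include <chrono>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// One node per (assumed) 64-byte cache line; "next" is the index to visit next.
struct Node {
    std::size_t next;
    char pad[64 - sizeof(std::size_t)];
};

// Build a walk order forming a single cycle over all n nodes.
// Linear: i -> i + 1 (wrapping). Randomized: Sattolo's algorithm, which
// always produces one n-cycle rather than several short ones.
std::vector<Node> make_chain(std::size_t n, bool randomized, unsigned seed = 42) {
    std::vector<Node> nodes(n);
    if (n == 0) return nodes;
    if (!randomized) {
        for (std::size_t i = 0; i < n; ++i) nodes[i].next = (i + 1) % n;
        return nodes;
    }
    for (std::size_t i = 0; i < n; ++i) nodes[i].next = i;
    std::mt19937 rng(seed);
    for (std::size_t i = n - 1; i > 0; --i) {
        std::uniform_int_distribution<std::size_t> pick(0, i - 1);
        std::swap(nodes[i].next, nodes[pick(rng)].next);
    }
    return nodes;
}

// Follow the chain from node 0. Each load depends on the previous one,
// so the hardware cannot overlap the misses. Requires a non-empty chain.
std::size_t chase(const std::vector<Node>& nodes, std::size_t steps) {
    std::size_t pos = 0;
    for (std::size_t s = 0; s < steps; ++s) pos = nodes[pos].next;
    return pos;
}

// Average wall-clock nanoseconds per access over "steps" dependent loads
// (steps must be greater than zero).
double ns_per_access(const std::vector<Node>& nodes, std::size_t steps) {
    const auto t0 = std::chrono::steady_clock::now();
    volatile std::size_t sink = chase(nodes, steps);  // keep the walk from being optimized away
    const auto t1 = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::nano>(t1 - t0).count() /
           static_cast<double>(steps);
}
```

Calling `ns_per_access` on linear and randomized chains of increasing size (from a few kilobytes up to well beyond the last-level cache) should show the steps between cache levels; converting nanoseconds to clock cycles requires knowing the core frequency.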

    Compiler bug in XE 2015: error : no instance of function template "..." matches the argument list


    The following code:

    #include <tuple>
    struct Foo {
        std::tuple<int> inner;
        template <unsigned Idx>
        auto get() const -> decltype(std::get<Idx>(inner)) { return std::get<Idx>(inner); }
    };
    int main()
    {
        Foo f;
        f.get<0>();  // assumed call site; not shown in the excerpt
    }

    produces the following error:

    Debug symbols always appear in the binary even without debug enabled


    My ICC version is intel_parallel_studio_xe_2015_update1 (trial version).

    I used the following commands to compile:

    icc  -w -fpermissive -fPIE -I. -DMKL_ILP64 -DLINUX  -std=c++11 -g0 -O3  -c xx.cpp -o /tmp/xx.o
    xiar rcs /tmp/xx.a /tmp/xx.o

    and used the following command to link.

    CPU2006 compile issues with MSVC 2013, ICC XE 2015 rev 3, windows server 2012


    ICL Version


    I have compile errors with 2 cpu2006 benchmarks.


    483.xalancbmk dies in execution if I compile with -O3 -ipo; it works fine with -O2

    453.povray says:

    file defaultrenderfrontend.cpp

    error "<mathimf.h> is incompatible with system <math.h>!"


    proc_bind(spread) does not seem to be honored

    Hello Folks,

    I have a program that is decomposed in two parts:
    One loop that allocates data: it does 4 iterations, one for each socket
    One loop that does computation on the data, it does 48 iterations (each thread should work on a slice of data, hopefully a slice of data that is on the local socket).

    My machine is a 4 socket, 12 cores per processor Xeon machine. I'm using ICC 15.0.1 20141023
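The program structure described above might look roughly like the sketch below. It is not the poster's code: the function `run`, the constants, and the slice layout are hypothetical. It also assumes that places are defined per core (for example `OMP_PLACES=cores`) and that cores are numbered contiguously per socket, so that thread `t` of the 48 lands on socket `t / 12`. Without OpenMP the pragmas are ignored and the loops run serially.

```cpp
#include <cstddef>
#include <vector>

constexpr int kSockets = 4;    // from the post: 4 sockets
constexpr int kThreads = 48;   // from the post: 4 sockets x 12 cores
constexpr int kPerSocket = kThreads / kSockets;

// Hypothetical driver: allocates one slab per socket, then lets each of the
// 48 iterations fill its own slice of the slab assumed to be socket-local.
std::vector<std::vector<double>> run(std::size_t slice_len) {
    std::vector<std::vector<double>> slabs(kSockets);

    // Loop 1: four iterations, one thread per socket. assign() allocates and
    // writes the memory on the calling thread (first touch).
    #pragma omp parallel for num_threads(kSockets) proc_bind(spread) schedule(static)
    for (int s = 0; s < kSockets; ++s) {
        slabs[s].assign(slice_len * kPerSocket, 0.0);
    }

    // Loop 2: 48 iterations, one per thread with schedule(static).
    #pragma omp parallel for num_threads(kThreads) proc_bind(spread) schedule(static)
    for (int t = 0; t < kThreads; ++t) {
        std::vector<double>& slab = slabs[t / kPerSocket];
        const std::size_t begin = static_cast<std::size_t>(t % kPerSocket) * slice_len;
        for (std::size_t i = 0; i < slice_len; ++i) {
            slab[begin + i] = static_cast<double>(t);  // placeholder computation
        }
    }
    return slabs;
}
```

Setting `OMP_DISPLAY_ENV=true` prints the binding settings the runtime actually uses, which can help when checking whether `proc_bind(spread)` takes effect.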
