Code optimization problem with icl 16.0

I have come across what I think is a code optimization problem with icl version 16.0. I'm running on Windows using this:

  Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64,
  Version Build 20150815

Here's my test program:

Installation command line argument for --components

I want to install Parallel Studio XE 2016 from the command line using parallel_studio_xe_2016_setup.exe.

However, I don't want to install the Fortran component. I only want to install Intel® Parallel Studio XE 2016 Composer Edition for C++ and Intel® Parallel Studio XE 2016 Cluster Edition.


What argument should I pass to the --components= option?


Best Regards

Branch Monitoring

Hi, I'm Marcus. I'm doing academic research on branch monitoring using Branch Trace Store (BTS) capabilities. Currently, I'm trying to implement a BTS monitor on Windows 7/8. I'm facing some challenges in this development, so I'm asking for any help.
I asked a question on the performance forum, but no one was able to answer it, so I am asking here, since VTune uses BTS monitoring on Windows.

mkl_zcsrcoo faster computation on subsequent calls?


I have a sparse matrix in coordinate format (row, col, A) and I convert it to CSR for use with PARDISO. The sparsity pattern never changes (that is, row and col are always the same), while the values in vector A change from time to time. As I understand it, I can run with job(6)=1 to get only ia. Is this any faster? What does job(6)=2 do?

Here is the documentation for job(6). Thanks!

For conversion to the CSR format:

If job(6)=0, all arrays acsr, ja, ia are filled in for the output storage.
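
To illustrate the fixed-pattern situation described in the question, here is a minimal sketch in plain Python. It does not call MKL and says nothing about how mkl_zcsrcoo behaves internally. The function names `build_csr_pattern` and `refill_values` are hypothetical, indices are zero-based (unlike the one-based convention common with PARDISO), and the COO input is assumed to contain no duplicate entries. The idea is to compute ia, ja and a permutation once, then reuse the permutation each time only the values change.

```python
def build_csr_pattern(n_rows, rows, cols):
    """One-time step: derive the CSR pattern from zero-based COO indices.

    Returns (ia, ja, perm), where perm[k] is the COO position of the
    entry stored in CSR slot k. Assumes no duplicate (row, col) pairs.
    """
    # Order the entries by row, then by column within each row.
    perm = sorted(range(len(rows)), key=lambda k: (rows[k], cols[k]))

    # Count the entries per row, then prefix-sum into row pointers.
    ia = [0] * (n_rows + 1)
    for r in rows:
        ia[r + 1] += 1
    for i in range(n_rows):
        ia[i + 1] += ia[i]

    ja = [cols[k] for k in perm]
    return ia, ja, perm


def refill_values(perm, a):
    """Repeated step: reorder new COO values into CSR order."""
    return [a[k] for k in perm]


# Hypothetical 3x3 example with four nonzeros.
rows = [2, 0, 1, 0]
cols = [2, 1, 1, 0]
ia, ja, perm = build_csr_pattern(3, rows, cols)

# Values change, pattern does not: only the cheap reordering is repeated.
acsr_first = refill_values(perm, [5.0, 2.0, 3.0, 1.0])
acsr_second = refill_values(perm, [50.0, 20.0, 30.0, 10.0])
```

With this split, the sort and the row-pointer construction run once, and each later update is a single pass over the values. Whether this is faster than repeating the library conversion would need to be measured for the actual matrix.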

Undefined symbol at run time, but not at link time, for an executable linked with a static library (offload model)


I am trying to run an offload application on Xeon Phi, and I get the following error.

"On the sink, dlopen() returned NULL. The result of dlerror() is "/var/volatile/tmp/coi_procs/1/205497/load_lib/iccouthqsEw3: undefined symbol: DISTMEM_rank"
On the remote process, dlopen() failed. The error message sent back from the sink is /var/volatile/tmp/coi_procs/1/205497/load_lib/iccouthqsEw3: undefined symbol: DISTMEM_rank
offload error: cannot load library to the device 0 (error code 20)"

A simple optimization methodology using Intel System Studio (VTune, C++ Compiler, Cilk Plus)


This article describes a simple optimization methodology using Intel® Cilk™ Plus and the Intel® C++ Compiler, based on the results of a performance analysis carried out with Intel® VTune Amplifier. Intel® System Studio 2015 includes the components mentioned above, which were used for this article.

Cross thread heap allocation/deallocation


    We are using Intel Inspector XE 2016 on a game built with Unreal Engine 4 to find memory leaks. However, it seems to report many false-positive pairs of "missing allocation" and "memory leak". We suspect that allocations made in one thread but freed in another thread are causing these false positives. Is there a solution or option to tell Intel Inspector that such reports are linked? We can't find our real memory leak because we are flooded with these reports.


    Assertion failed: thread_manager_impl485: (blocked == tpss_tls_op_err_ok): BUG!

    Hi, I use VTune Amplifier XE 2015 to collect performance data for a Postgres database on a remote Linux server, and I use Benchmark Factory 7.0 (BMF) to send SQL queries to the server, but the test fails to finish.

    The following picture shows the error VTune reports. Does the statement "blocked == tpss_tls_op_err_ok" belong to VTune or to BMF? Is VTune incompatible with BMF? How can I get past this problem?

    Thanks for your help in advance.

