No Cost Options for Intel Math Kernel Library (MKL), Support Yourself, Royalty-Free

This guide describes the ways to obtain the latest version of the Intel® Math Kernel Library (Intel® MKL) for free, without access to Intel® Premier Support (support is available by posting to the Intel Math Kernel Library forum). The full suite of tools (Intel® Parallel Studio XE), with Intel® Premier Support and access to previous library versions, can be purchased worldwide at any time.

Diagnostic 3180: unrecognized OpenMP #pragma

The test code below worked with Intel Composer 2013 but not with SP1 Update 5,
which gives me "Diagnostic 3180: unrecognized OpenMP #pragma".
Thanks in advance.

// OpenMPTest.cpp : Defines the entry point for the console application.

#include "stdafx.h"
#include <cstdlib>  // rand
#include <map>
#include <omp.h>

int _tmain(int argc, _TCHAR* argv[])
{
	std::map<int, int> box;
	box[0] = 0;
	box[1] = 0;
	box[2] = 0;

	// The pragma on this iterator loop is what is reported as diagnostic 3180.
#pragma omp parallel for
	for (auto iter = box.begin(); iter != box.end(); ++iter) {
		(*iter).second = rand();
	}
	return 0;
}
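
A possible workaround, offered as a sketch rather than a confirmed diagnosis: `std::map` iterators are bidirectional, not random-access, and OpenMP's canonical loop form (since OpenMP 3.0) accepts only integer, pointer, or random-access-iterator loop variables. Assuming that is what the compiler objects to, the elements can be gathered into a vector first and the parallel loop driven by an integer index. The helper name `fill_box` is hypothetical, and `rand()` is replaced by a deterministic function of the key because `rand()` is not guaranteed to be thread-safe.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post): sets every mapped value
// to key * 10, in parallel when the compiler has OpenMP enabled.
void fill_box(std::map<int, int>& box) {
    // Collect pointers to the elements serially. Pointers to std::map
    // elements stay valid as long as the elements are not erased.
    std::vector<std::pair<const int, int>*> slots;
    slots.reserve(box.size());
    for (auto& entry : box) {
        slots.push_back(&entry);
    }
    assert(slots.size() == box.size());

    // A signed integer index satisfies the canonical loop form.
    const int n = static_cast<int>(slots.size());
#pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        // Each iteration writes to a distinct element, so no locking is needed.
        slots[i]->second = slots[i]->first * 10;
    }
}
```

Without OpenMP enabled the pragma is ignored and the loop simply runs serially, so the function gives the same result either way. For a map of only three elements the threading overhead will almost certainly outweigh any gain; the pattern only pays off for much larger containers or heavier per-element work.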


No Cost Options for Intel Parallel Studio XE, Support Yourself, Royalty-Free

Intel® Parallel Studio XE is a very popular product from Intel that includes the Intel® Compilers, Intel® Performance Libraries, tools for analysis, debugging and tuning, tools for MPI and the Intel® MPI Library. Did you know that some of these are available for free? Here is a guide to “what is available free” from the Intel Parallel Studio XE suites.

Intel® Parallel Studio XE 2016: High Performance for HPC Applications and Big Data Analytics

Intel® Parallel Studio XE 2016, launched on August 25, 2015, is the latest installment in our developer toolkit for high performance computing (HPC) and technical computing applications. This suite of compilers, libraries, debugging facilities, and analysis tools targets Intel® architecture, including support for the latest Intel® Xeon® processors (codenamed Skylake) and Intel® Xeon Phi™ processors (codenamed Knights Landing). Intel® Parallel Studio XE 2016 helps software developers design, build, verify, and tune code in Fortran, C++, C, and Java.

OpenMP slower than serial codes


I am trying to parallelize a serial preconditioned conjugate gradient solver for 3D fire simulation using OpenMP (Intel compiler), but the performance does not seem to improve.

The grid dimension is 79x81x79 and the solver converges after 565 iterations. The serial code takes 3.39 seconds and the OpenMP version needs 3.86 seconds on an Intel i7 2600.

Please help me check the code. Thanks a lot.
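
The poster's code is not shown here, so the following is only a generic sketch of one kernel that every conjugate gradient iteration needs: a dot product over the grid vectors, parallelized with an OpenMP reduction. The function name `dot` and the use of `std::vector<double>` are assumptions for illustration, not the poster's implementation. At 565 iterations in 3.39 seconds, each iteration takes about 6 ms, so per-loop threading overhead and memory bandwidth can easily cancel the gain from such loops.

```cpp
#include <vector>

// Generic dot product (illustrative, not the poster's code). The reduction
// clause gives each thread a private partial sum and combines the partial
// sums when the loop finishes.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    const long n = static_cast<long>(a.size());
    double sum = 0.0;
#pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

For a 79x81x79 grid, each vector holds 505,521 values. Opening a separate parallel region for every small loop in every iteration is a common source of overhead, so comparing timings with one enclosing parallel region per iteration may be worthwhile.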

An example to quickly solve performance issue in OpenMP* program by using VTune Amplifier’s results

If you compile and run your OpenMP* code with Intel Compiler 13.1 Update 2 or later, the advanced-hotspots analysis in VTune(TM) Amplifier XE 2015 Update 4 reports important metrics, which are categorized into "Serial Time" and "Parallel Region Time". An “OpenMP Potential Gain” metric is also provided to show whether there is more work to do in optimizing the code. In addition, VTune Amplifier highlights: