Parallel Computing

Using the Intel ODE solver when the variables are vectors


I am trying to use the Intel ODE solver dodesol to solve a set of ODEs. I have run the example and it runs fine. The problem is that the example only covers the case where the variables are scalars. For example:

Y1' = f1(Y1,t)

Y2' = f2(Y2,t)

The example shows how to use it when Y1 and Y2 are two numbers. What if each of them is a 1D, 2D, or 3D array? Is there an example for that?
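A common way to handle array-valued unknowns is to flatten each array into one contiguous state vector and let the right-hand side work on that vector. The sketch below illustrates only the flattening idea. It assumes the solver accepts a single flat array of n doubles, which should be confirmed against the dodesol documentation. The names `flat_index`, `rhs_decay`, and `euler_step` are hypothetical, and the explicit Euler step is a stand-in integrator, not the Intel ODE solver API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: row-major flat index of element (i, j) in a field with ny columns.
inline std::size_t flat_index(std::size_t i, std::size_t j, std::size_t ny) {
    assert(j < ny);
    return i * ny + j;
}

// Illustrative right-hand side over the flat state: dy/dt = -y for every element.
// A real problem would couple neighbouring elements, addressing them via flat_index.
void rhs_decay(std::size_t n, double /*t*/, const double* y, double* dydt) {
    for (std::size_t k = 0; k < n; ++k) {
        dydt[k] = -y[k];
    }
}

// Stand-in integrator (one explicit Euler step); NOT the Intel ODE solver.
void euler_step(std::size_t n, double t, double h, std::vector<double>& y,
                void (*rhs)(std::size_t, double, const double*, double*)) {
    assert(y.size() == n);
    std::vector<double> dydt(n);
    rhs(n, t, y.data(), dydt.data());
    for (std::size_t k = 0; k < n; ++k) {
        y[k] += h * dydt[k];
    }
}
```

With this layout a 2D field of size nx by ny becomes a state vector of length nx * ny, and a 3D field extends the same index arithmetic by one more dimension.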




Download a Very Old Install Package (11.1)

I'd like to download and test a build of software we have for a legacy Itanium machine.  For this package, we require the Intel 11.1.064 compilers.  If I go to the downloads page, I can get the 11.1.080 install package but can't find other patch versions of 11.1.

Is it still possible to download and install the 11.1.064 compiler suite?  (We have a valid license, etc.)


Significant performance improvement of symmetric eigensolvers and SVD in Intel MKL 11.2

Intel MKL 11.2 contains a number of optimizations for symmetric eigensolvers and SVD. They mostly benefit large matrices (N > 4000, 6000, and up), where the speedups over the previous MKL 11.1 are significant: SVD runs up to 6 times faster (or even more at large thread counts and matrix sizes), and speedups of several times can be observed for the eigensolvers as well. More details can be found in this KB article.

Having trouble running Hot Spot Analysis on a program


I am currently evaluating Intel VTune Amplifier XE 2015, but I am having trouble running a Basic Hotspots analysis. When executed from Visual Studio, the app starts up, takes a while to load, then just crashes. VTune then shows the following:

"Failed to write probes in process, can't complete attach"

After some debugging, the following API call seems to be the cause: ConvertStringSecurityDescriptorToSecurityDescriptor
Without VTune analyzing the code, this runs without any problems.

Any ideas?

Serial vs Parallel with hashmap & pipeline: discrepancy

Dear all,

I'm porting another simple I/O-intensive piece of code to TBB, but my two versions differ hugely in their results. The serial version uses an unordered_map of strings to ints, and the parallel one accordingly uses a concurrent_hash_map. The pipeline reads strings and counts them concurrently, as you will see, making use of std::atomic.

The serial code is as follows:

ippiFilterBox accesses data beyond input buffer

I recently noticed that ippiFilterBox_32f_C1R can access data beyond the input buffer's limits (including the external border). Usually this does not cause much of a problem, but of course it can sometimes lead to an access violation. For example, Microsoft's Application Verifier can detect the issue reliably. I can reproduce it with IPP 7.1 and 8.0.
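For context, the external border mentioned above can be sketched as plain arithmetic. The example assumes the usual anchored-mask convention, in which each output pixel reads source pixels from offset -anchor to mask - 1 - anchor. That convention is an assumption here and should be checked against the IPP manual for the version in use. The names `Border`, `box_filter_border`, `padded_width`, and `padded_height` are hypothetical and are not IPP functions.

```cpp
#include <cassert>

// Hypothetical helper type: extra source pixels a box filter reads on each side of the ROI.
struct Border {
    int left, right, top, bottom;
};

// Border implied by a maskW x maskH box mask with anchor (anchorX, anchorY),
// assuming reads span offsets [-anchor, mask - 1 - anchor] around each pixel.
Border box_filter_border(int maskW, int maskH, int anchorX, int anchorY) {
    assert(maskW > 0 && maskH > 0);
    assert(anchorX >= 0 && anchorX < maskW);
    assert(anchorY >= 0 && anchorY < maskH);
    return Border{anchorX, maskW - 1 - anchorX, anchorY, maskH - 1 - anchorY};
}

// Size in pixels of the padded source region that must be readable.
int padded_width(int roiW, const Border& b) { return b.left + roiW + b.right; }
int padded_height(int roiH, const Border& b) { return b.top + roiH + b.bottom; }
```

Any read outside the padded region computed this way would be beyond what the caller is expected to allocate, which is the kind of access the post describes.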

An excerpt from the code triggering the problem:

Cluster version of PARDISO is available as part of the latest Intel MKL 11.2


The main features in Direct sparse solvers for Clusters functionality are:

  • Distributed CSR format, with support for a distributed matrix, right-hand side, and/or solution
  • Solving systems with multiple right-hand sides
  • Cluster support of factorization and solving steps
  • C and Fortran examples

Please find more details here.

FEAST Eigenvalue solver using MKL-Pardiso



I have just started to test the FEAST eigenvalue solver for my nanodevice simulations. I have a question, since it uses MKL-Pardiso as its inner linear solver, and it would be nice if someone could give some insight into the issue I am facing.


In particular, I have been using the FEAST_Sparse interface, which requires MKL-Pardiso. My matrix is in CRS format, which makes it easy to use the predefined driver interfaces in the FEAST solver package.

