Threading on Intel® Parallel Architectures

Any solution for “dynamic scheduling” in MKL?

I wonder if there is some way to make an MKL routine behave like #pragma omp for schedule(dynamic) in OpenMP.

I'm using Sparse BLAS to do SpMV, and for some reason I decided to reorder the rows of a sparse matrix by length. This leaves the workload of MKL's internal threads unbalanced, since SpMV in MKL is parallelized by row and is probably statically scheduled.

If there's some function that controls the scheduling strategy of the internal threads and that I don't know about, it would be great.
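For comparison, here is a minimal sketch of a hand-written CSR SpMV that uses the OpenMP schedule the question refers to. It is not an MKL routine and says nothing about how MKL schedules its threads internally; the `CsrMatrix` struct, the `spmv_dynamic` name, and the chunk size of 64 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal CSR container (not an MKL type).
struct CsrMatrix {
    std::int64_t rows = 0;
    std::vector<std::int64_t> row_ptr;  // size rows + 1
    std::vector<std::int64_t> col_idx;  // size nnz
    std::vector<double> values;         // size nnz
};

// Computes y = A * x with one row per loop iteration. Rows are handed out
// in chunks at run time, so a thread that finishes a batch of short rows
// picks up more work instead of idling.
std::vector<double> spmv_dynamic(const CsrMatrix& a, const std::vector<double>& x) {
    assert(a.row_ptr.size() == static_cast<std::size_t>(a.rows) + 1);
    std::vector<double> y(static_cast<std::size_t>(a.rows), 0.0);

    // Chunk size 64 is an arbitrary starting point; tune it for the matrix.
    #pragma omp parallel for schedule(dynamic, 64)
    for (std::int64_t i = 0; i < a.rows; ++i) {
        double sum = 0.0;
        for (std::int64_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[i] = sum;
    }
    return y;
}
```

Built without OpenMP support, the pragma is ignored and the loop runs serially with the same result. With rows sorted by length, a small chunk size spreads the long rows across threads at the cost of more scheduling overhead.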

Looking for ways to detect Hyper Threading

Hi,

I am looking for ways to detect Hyper-Threading in a C++ program on Linux. I don't intend to read the /proc/cpuinfo file.

I have tried cpuid and checked the edx value, but it returns the same value on machines with and without HT.

It seems that some cpucounter program on the web does not work on intel64.

I am using a 64-bit Linux machine with:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz

...

Thank you.
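One likely reason the edx check gives the same answer everywhere: the HTT flag (CPUID leaf 1, EDX bit 28) only indicates that the package can report more than one logical processor, so it is also set on multi-core parts with Hyper-Threading off. A minimal sketch of an alternative, assuming GCC or Clang with `<cpuid.h>` and a CPU that implements the extended topology leaf 0x0B, is shown here. The function names are hypothetical, and the count reported by this leaf may not reflect a BIOS setting on every system, so treat the result as a hint.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang helpers for the CPUID instruction
#endif

// Decodes one sub-leaf of CPUID leaf 0x0B.
// ECX[15:8] is the level type (1 = SMT, 2 = core); EBX[15:0] is the number
// of logical processors at that level. Returns that count for the SMT
// level and 0 for any other level type.
unsigned smt_threads_from_leaf_0b(std::uint32_t ebx, std::uint32_t ecx) {
    const unsigned level_type = (ecx >> 8) & 0xFFu;
    const unsigned logical = ebx & 0xFFFFu;
    return level_type == 1u ? logical : 0u;
}

// Returns the logical processors per core as reported by CPUID,
// or 0 when leaf 0x0B is unavailable (or on a non-x86 build).
unsigned smt_threads_per_core() {
#if defined(__x86_64__) || defined(__i386__)
    if (__get_cpuid_max(0, nullptr) < 0x0Bu) {
        return 0;
    }
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    __cpuid_count(0x0B, 0, eax, ebx, ecx, edx);  // sub-leaf 0 is the SMT level
    return smt_threads_from_leaf_0b(ebx, ecx);
#else
    return 0;
#endif
}
```

A return value greater than 1 suggests Hyper-Threading is active on the core that executed the instruction; 1 suggests it is off, and 0 means this method could not tell.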

About Adaptive Mode for L1 Cache in Hyper-threading

Dear all:

I'm a student who has recently been doing some research on Hyper-Threading. I'm a little confused about one feature: the L1 Data Cache Context Mode.

In the architecture optimization manual, http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia...

It describes the L1 cache as operating in two modes:

The first level cache can operate in two modes depending on a context-ID bit:

Compiling OpenMP Fortran programs

Suppose I have a FORTRAN file that doesn't have any OpenMP directives, but has a routine that will be called from within an OpenMP parallel region.

Is there any difference in compiling this file with one of these options:

-openmp

-recursive

-auto

If there is a difference, is there any downside to using these in combination? (It would simplify some Makefiles.)

Thank you

--rr

Performance of Multi-threaded Applications

I have a multi-threaded application that runs 20% slower on my MacBook Pro with two threads than with one. I checked for blocking conditions and found that this is not the problem. The application is huge and accesses a huge in-memory database, so the cache doesn't have much effect on performance. So I figure the problem is that this machine does not have enough memory bandwidth to support two threads that access a lot of memory.
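One way to test the bandwidth hypothesis is to time a plain streaming read with one thread and then with two. The sketch below assumes an OpenMP-capable compiler; `ReadResult` and `stream_read` are hypothetical names, and a sequential read is only an upper bound, since a database workload with scattered accesses will see less.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

struct ReadResult {
    double sum;             // checksum, so the read loop cannot be optimized away
    double gib_per_second;  // bytes read divided by elapsed wall time
};

// Sums the array using the requested number of OpenMP threads and reports
// the achieved read throughput.
ReadResult stream_read(const std::vector<double>& data, int threads) {
    assert(threads > 0);
    const std::int64_t n = static_cast<std::int64_t>(data.size());
    double sum = 0.0;

    const auto start = std::chrono::steady_clock::now();
    #pragma omp parallel for num_threads(threads) reduction(+:sum) schedule(static)
    for (std::int64_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    const auto stop = std::chrono::steady_clock::now();

    const double seconds = std::chrono::duration<double>(stop - start).count();
    const double gib = static_cast<double>(n) * sizeof(double) / (1024.0 * 1024.0 * 1024.0);
    return {sum, seconds > 0.0 ? gib / seconds : 0.0};
}
```

For a meaningful measurement, the array should be much larger than the last-level cache and each configuration should be run several times. If the two-thread throughput is no higher than the one-thread figure, memory bandwidth is a plausible limit; if it roughly doubles, the slowdown probably comes from somewhere else.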
