@Patrick The article for OpenMP was published.
Threading on Intel® Parallel Architectures
Any solution for “dynamic scheduling” in MKL?
I wonder if there is some way to make an MKL routine behave like #pragma omp for schedule(dynamic) in OpenMP.
I'm using Sparse BLAS to do SpMV, and for some reason I decided to reorder the rows of the sparse matrix by length. This leaves the workload of MKL's internal threads unbalanced, since SpMV in MKL is parallelized by row and is probably statically scheduled.
If there is some function that controls the scheduling strategy of the internal threads that I don't know about, that would be great.
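For comparison, here is a minimal sketch of a hand-written, row-parallel CSR SpMV that uses schedule(dynamic) directly. It is not an MKL API and says nothing about how MKL schedules its own threads; the `CsrMatrix` struct, the `spmv_dynamic` name, and the default chunk size of 64 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal CSR (compressed sparse row) matrix. The struct and field names are
// illustrative only; they are not MKL types.
struct CsrMatrix {
    std::size_t rows = 0;
    std::vector<std::size_t> row_ptr;  // size rows + 1
    std::vector<std::size_t> col_idx;  // size nnz
    std::vector<double> values;        // size nnz
};

// Computes y = A * x, parallelized by row. schedule(dynamic, chunk) hands out
// rows in chunks at run time, so a thread that finishes short rows early picks
// up more work. If OpenMP is not enabled, the pragma is ignored and the loop
// runs serially with the same result.
std::vector<double> spmv_dynamic(const CsrMatrix& a,
                                 const std::vector<double>& x,
                                 int chunk = 64) {
    assert(a.row_ptr.size() == a.rows + 1);
    assert(chunk > 0);
    std::vector<double> y(a.rows, 0.0);
    // A signed loop index keeps the loop valid for older OpenMP versions.
    const long long n = static_cast<long long>(a.rows);
    #pragma omp parallel for schedule(dynamic, chunk)
    for (long long i = 0; i < n; ++i) {
        const std::size_t r = static_cast<std::size_t>(i);
        double sum = 0.0;
        for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[r] = sum;
    }
    return y;
}
```

A smaller chunk balances rows of very different length better but adds scheduling overhead, so the chunk size is worth tuning on the actual matrix.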
Processor performance with hyperthreading
Can anybody help me to understand the following situation:
Looking for ways to detect Hyper Threading
Hi,
I am looking for ways to detect Hyper-Threading in a C++ program on Linux. I don't intend to read the /proc/cpuinfo file.
I have tried cpuid and then checked the edx value, but somehow it returns the same value on machines with HT and without HT.
It seems some cpucounter programs on the web do not work for Intel 64.
I am using a 64-bit Linux machine with:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz
...
Thank you.
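One possible approach that avoids /proc/cpuinfo is to read the Linux sysfs topology files, sketched below. If the edx value checked was the HTT flag (CPUID leaf 1, EDX bit 28), note that this flag only says the package can expose more than one logical processor, so it is also set on multi-core parts without HT. The function names here are hypothetical, and the sketch assumes cpu0 is online and that the kernel exposes `thread_siblings_list` under sysfs.

```cpp
#include <cassert>
#include <exception>
#include <fstream>
#include <sstream>
#include <string>

// Counts the CPUs named in a Linux cpulist string such as "0,6" or "0-1,8-9".
// Returns 0 for an empty or malformed string.
int count_cpus_in_list(const std::string& list) {
    int count = 0;
    std::istringstream ss(list);
    std::string item;
    while (std::getline(ss, item, ',')) {
        try {
            std::size_t pos = 0;
            const int first = std::stoi(item, &pos);
            int last = first;
            if (pos < item.size() && item[pos] == '-') {
                last = std::stoi(item.substr(pos + 1));  // range "first-last"
            }
            if (last < first) return 0;
            count += last - first + 1;
        } catch (const std::exception&) {
            return 0;  // not a number: treat the whole list as malformed
        }
    }
    return count;
}

// Returns true if cpu0 currently shares its core with at least one other
// online logical CPU. Returns false if the sysfs file cannot be read, so a
// false result means "not detected", not "definitely absent".
bool smt_active_on_cpu0() {
    std::ifstream f("/sys/devices/system/cpu/cpu0/topology/thread_siblings_list");
    std::string line;
    if (!f || !std::getline(f, line)) return false;
    return count_cpus_in_list(line) > 1;
}
```

This reports what the running kernel sees: if HT is disabled in the BIOS or the sibling CPUs are offline, the list contains a single CPU even on HT-capable hardware.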
Distributed Reader-Writer Mutex 1.0
Hello,
Distributed Reader-Writer Mutex 1.0
Description:
Parallel sorting algorithms
Hello,
Look at the following link...
It's about parallel partition...
About Adaptive Mode for L1 Cache in Hyper-threading
Dear all:
I'm a student doing some research on Hyper-threading recently. I'm a little confused about the feature - L1 Data Cache Context Mode.
In the architecture optimization manual, http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia...
it is stated that the L1 cache can operate in two modes:
The first level cache can operate in two modes depending on a context-ID bit:
Compiling OpenMP Fortran programs
Suppose I have a FORTRAN file that doesn't have any OpenMP directives but has a routine that will be called from within an OpenMP parallel region.
Is there any difference between compiling this file with each of these options:
-openmp
-recursive
-auto
If there is a difference, is there any downside to using these in combination (it simplifies some Makefiles)?
Thank you
--rr
cuda programming from intel c/c++
I know how to write the code to use the GPU's capabilities, but I still can't figure out how to call the CUDA compiler from the Intel C/C++ compiler. My problem is this: once I have the program, whether it's coded well or badly, how do I compile it?
I'm using Intel C/C++ Composer Studio under Windows 7 on a CUDA-capable computer, of course.
Performance of Multi-threaded Applications
I have a multi-threaded application that runs 20% slower on my MacBook Pro with two threads than with one. I checked for blocking conditions and found that this is not the problem. The application is huge and accesses a huge in-memory database, so the cache doesn't have much effect on performance. So I figure the problem is that this machine does not have enough memory bandwidth to support two threads that access a lot of memory.
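One way to probe the bandwidth hypothesis, independent of the application, is a small streaming read test run with one thread and then with two. The sketch below is only an illustration under assumptions: `timed_sum` and `read_bandwidth_gbs` are hypothetical helper names, the array should be much larger than the last-level cache, and the code needs OpenMP enabled at compile time to actually use more than one thread.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Sums `data` with up to `threads` OpenMP threads and returns the total.
// The wall-clock time of the pass is written to `seconds`. Without OpenMP
// enabled the pragma is ignored and the loop runs on a single thread.
double timed_sum(const std::vector<double>& data, int threads, double& seconds) {
    assert(threads > 0);
    const long long n = static_cast<long long>(data.size());
    double total = 0.0;
    const auto start = std::chrono::steady_clock::now();
    #pragma omp parallel for num_threads(threads) reduction(+:total) schedule(static)
    for (long long i = 0; i < n; ++i) {
        total += data[static_cast<std::size_t>(i)];
    }
    const auto stop = std::chrono::steady_clock::now();
    seconds = std::chrono::duration<double>(stop - start).count();
    return total;
}

// Approximate read bandwidth in GB/s for one pass over `elements` doubles.
double read_bandwidth_gbs(std::size_t elements, double seconds) {
    if (seconds <= 0.0) return 0.0;
    return static_cast<double>(elements * sizeof(double)) / seconds / 1e9;
}
```

If the GB/s figure barely changes between one and two threads on a large array, a single thread is already close to saturating memory bandwidth for this access pattern. That would be consistent with the hypothesis, though a sequential sum is a much friendlier access pattern than a database lookup, so it does not prove it.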
