Threading on Intel® Parallel Architectures

How to track down OpenMP segfault caused by the addition of ORDERED?

Dear all,

I hope this is the right place to ask this question.

I am working on adding OpenMP support to some existing Fortran code, using ifort version 15.

I noticed that the addition of the c$OMP ORDERED clause to my outer parallel do loop causes the program to segfault in the second loop iteration, when attempting to access a FIRSTPRIVATE variable.  This occurs with OMP_NUM_THREADS=1.  The same error also occurs with ifort 14.0.2.

'Wildhoney' - the 512bit superfast textual decompressor - some thoughts

Hi to all.

I am glad to have finally joined the Intel forum; it was long overdue.
Here I want to share my amateur's view on the topic of superfast textual decompression.

For 4 months now I have been playing with my file-to-file decompressor named Nakamichi.
I am on a quest to write the fastest possible variant of my approach: branchless code combined with only one native (highest-order) register on the latest machines.
This translates to mixed 64bit/512bit code.
A few hours ago I wrote the 'Wildhoney' variant using just that configuration.

TBB: Using task_scheduler_observer to set worker thread's OS scheduling priority

I'm looking at TBB's task_arena and task_scheduler_observer.

The documentation for task_scheduler_observer sketches out a nice example of it being used to set thread affinity on worker threads to lock an arena's threads onto a subset of cores.

API for Haswell's TSX


I have just begun my research on HTM, primarily focusing on RTM (Restricted Transactional Memory). Are there any APIs for RTM? I have looked on the internet, but I can find only the basic RTM instructions, such as xbegin, xend, xabort, and xtest.

I want to be able to access shared memory with HTM, but I cannot find any library files for it.

Can you please point me in the right direction, thanks for your support.

CL_DEVICE_TYPE_CPU not working in Windows 8.1

I recently tried to run my OpenCL program on a new Windows 8.1 computer, but the program returns an error when the device type is CL_DEVICE_TYPE_CPU. When I change the device type to CL_DEVICE_TYPE_GPU or CL_DEVICE_TYPE_ALL, the program runs on the GPU.

Here is the system specification of the new computer:
OS: Windows 8.1
Processor: Intel Core i7 - 4700MQ clocked at 2.40GHz
Display Adapter: Intel HD Graphics 4600 and NVIDIA GeForce GT 740M

How can I resolve this problem, and does OpenCL have issues with Windows 8.1? Please help!


Launching several MPI processes on multicore nodes

Hi everyone,

I have a simple issue, which must have a solution. Is it possible to assign several MPI processes to several nodes, such that the first MPI process occupies a full node, whereas the other MPI processes are distributed across the cores of the other nodes?

I have an example below:

On a cluster with 4 cores per node, to assign 2 MPI processes to 2 nodes I do the following:

#PBS -l nodes=2:ppn=4

mpirun -pernode -np 2 ./hybprog
