Threading on Intel® Parallel Architectures

OpenMP Block gives false results

Hi all,

I would appreciate your point of view on where I might have gone wrong using OpenMP. I parallelized this code in a pretty straightforward way, yet even with a single thread (i.e., call omp_set_num_threads(1)) I get wrong results.

I have checked with Intel Inspector and there is no race condition, although the tool did warn that a thread might access another thread's stack. (I get the same warning in other code of mine, which runs well with OpenMP.) I'm pretty sure this is not related to the problem.

Thanks, Jack.
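The code from this post is not shown, so the following is only a general illustration and not a diagnosis. One common way an OpenMP loop can give wrong results even with one thread is a data-sharing clause that changes what a variable means inside the region: a `private` copy starts out uninitialized, and its final value is not copied back to the original variable. A minimal sketch in C, assuming the loop accumulates into a scalar and reads a value set before the loop (the function names `sum_squares` and `sum_with_offset` are hypothetical):

```c
/* Hypothetical helper: returns the sum of i*i for i in [0, n). */
long sum_squares(int n)
{
    long total = 0;
    /* reduction(+:total) gives each thread its own copy initialized to 0
       and adds the copies back into the original `total` at the end.
       With private(total) instead, the copy would start uninitialized
       and the original would not receive the result, even with a
       single thread. */
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += (long)i * i;
    }
    return total;
}

/* Hypothetical helper: returns the sum of (i + offset) for i in [0, n). */
long sum_with_offset(int n, long offset)
{
    long total = 0;
    /* firstprivate(offset) gives each thread a private copy that is
       initialized from the value `offset` had before the region. */
    #pragma omp parallel for firstprivate(offset) reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += i + offset;
    }
    return total;
}
```

Built without OpenMP support, the pragmas are ignored and both functions run serially with the same results, which makes a convenient baseline when checking a parallelized loop.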

Slowdown with OpenMP

I'm getting some pretty unusual results from using OpenMP on a fractional differential equations code written in Fortran. No matter where I use OpenMP in the code, whether on an initialization loop or on a computational loop, I get a slowdown across the entire code. I can put OpenMP in one loop and it will slow down an unrelated one (timed separately)! The code is a bit unusual, as it initializes arrays starting at index 0 (and some even at negative indices). For example,

web crawling through "Intel Xeon Phi Coprocessors"

I am new to this forum. I want to implement parallel web crawling on Intel Xeon Phi coprocessors for my project. Before buying equipment, installing software, and starting to learn the platform, I want to know whether it is possible to connect to the network and fetch web URLs in parallel using this technology. (I don't want to build a cluster of CPUs for this; I want to do it with a single card.)

Intel MPI for Phi tuning tips?

Does setting


change other MPI environment variables, particularly any that would tune MPI for the MIC system architecture?  

As a side question, has anyone written a Tuning and Tweaking guide for IMPI for Phi?  For example, what I_MPI variables could one use to help tune an app targeting 480 ranks across 8 Phis?



Lock-free Java, or better scaling on multi-core systems

Everyone these days has to address multi-core issues, or vertical scaling, at least on the server-side of things. And there does not seem to be a general approach, so we end up re-architecting our applications every time we add cores. At the same time, the availability of many-core processors seems to be constrained by the lack of a reasonable software technology to make good use of them.

igzip for VS10 C++?

I was searching for a faster zlib-compatible compressor and came across the paper describing igzip --

High Performance DEFLATE Compression on Intel Architecture Processors

igzip looks like exactly (!) what I am looking for.  Compatible with zlib, but faster.

However, the downloadable source was for Linux.  I need it for a VS10 C++ project.  I have successfully (I think) compiled and assembled the desired modules (common, crc, crc_utils, hufftables, hufftables_c.cpp, igzip0c_body, igzip0c_finish, init_stream) into a .lib. 

OpenCL vs Intel Cilk Plus Issues, Differences and Capabilities

I am curious about the differences between OpenCL and Intel Cilk Plus. Both are parallel programming paradigms that are receiving wide recognition, but technically speaking, is one better than the other, or are they simply different? Also, what yardstick should I use when choosing between the two to solve an embarrassingly parallel problem? Please, I need answers.


