Intel® Moderncode for Parallel Architectures

Lock-free Java, or better scaling on multi-core systems

Everyone these days has to address multi-core issues, or vertical scaling, at least on the server side. There does not seem to be a general approach, so we end up re-architecting our applications every time we add cores. At the same time, the adoption of many-core processors seems to be held back by the lack of a reasonable software technology to make good use of them.
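For readers unfamiliar with the term, here is a minimal sketch of what "lock-free" usually means in Java: a shared counter updated with a compare-and-set retry loop on java.util.concurrent.atomic.AtomicLong instead of a synchronized block. The class name LockFreeCounter and the runDemo helper are hypothetical names made up for this illustration; they are not taken from the post above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class: a counter that never takes a lock.
public class LockFreeCounter {
    private final AtomicLong value = new AtomicLong();

    // Increment using an explicit compare-and-set (CAS) retry loop.
    public long increment() {
        while (true) {
            long current = value.get();
            long next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next; // our update was published atomically
            }
            // Another thread changed the value first; re-read and retry.
        }
    }

    public long get() {
        return value.get();
    }

    // Hypothetical demo helper: start `threads` workers, each incrementing
    // the shared counter `perThread` times, and return the final count.
    public static long runDemo(int threads, int perThread) throws InterruptedException {
        LockFreeCounter counter = new LockFreeCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join(); // wait for every worker before reading the total
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo(4, 100_000));
    }
}
```

No thread ever blocks while holding a lock here; a thread that loses the race simply retries. Under heavy contention a single hot counter like this can still become a bottleneck, so this shows the basic technique only and is not a general answer to the scaling question.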

igzip for VS10 C++?

I was searching for a faster zlib-compatible compressor and came across the paper describing igzip:

High Performance DEFLATE Compression on Intel Architecture Processors

igzip looks like exactly (!) what I am looking for: compatible with zlib, but faster.

However, the downloadable source was for Linux.  I need it for a VS10 C++ project.  I have successfully (I think) compiled and assembled the desired modules (common, crc, crc_utils, hufftables, hufftables_c.cpp, igzip0c_body, igzip0c_finish, init_stream) into a .lib. 

OpenCL vs Intel Cilk Plus Issues, Differences and Capabilities

I am curious about the differences between OpenCL and Intel Cilk Plus. Both are parallel programming models that are receiving wide recognition, but technically speaking, is one better than the other, or are they simply different? Also, what yardstick should I use when choosing between the two to solve an embarrassingly parallel problem? Any answers would be appreciated.



Thread complexion (Multi-threading)

Hello everyone,

The other day I was trying to create a thread that could capture the work of an already running thread and copy it. I also want to set thread priorities so that threads can capture the work of other threads at the same priority level, and to dynamically increase thread capacity to handle similar kinds of work.

I would appreciate it if anybody could help with this.



The list of out-of-order CPUs


I would like to know the list of commercial products (CPUs / SoCs) made by Intel that support out-of-order execution.

I noticed that the new Bay Trail architecture apparently supports this kind of execution, but I have no information about other architectures: Xeon, Core i-series, previous Atoms, Celerons and Pentiums. At this point I also have no specific information about the subsets of a given family. For example, Bay Trail is usually split into Bay Trail-M and Bay Trail-T, and I can only speculate that the new out-of-order support applies to both.

Linking against both the sequential and threaded mkl

I have two DLLs that link against the static MKL libraries. One of the DLLs links against the sequential version and the other against the multi-threaded version. Both DLLs are then loaded into the same process. Does anybody know whether this is safe to do?

Kind regards


How to track down OpenMP segfault caused by the addition of ORDERED?

Dear all,

I hope this is the right place to ask this question.

I am working on adding OpenMP support to some existing Fortran code, using ifort version 15.

I noticed that the addition of the c$OMP ORDERED clause to my outer parallel do loop causes the program to segfault in the second loop iteration, when attempting to access a FIRSTPRIVATE variable.  This occurs with OMP_NUM_THREADS=1.  The same error also occurs with ifort 14.0.2.

'Wildhoney' - the 512bit superfast textual decompressor - some thoughts

Hi to all.

I am glad that I have finally joined the Intel forum; it was long overdue.
Here I want to share my amateur view on the topic of superfast textual decompression.

For 4 months now I have been playing with my file-to-file decompressor named Nakamichi.
I am on a quest to write the fastest possible variant of my approach: branchlessness combined with using only one native (highest order) register on the latest machines.
This translates to mixed 64-bit/512-bit code.
A few hours ago I wrote the 'Wildhoney' variant using just that configuration.
