Intel® Moderncode for Parallel Architectures

Strange crashes of OMP-parallelized FFT code on Itanium

A small bit of background:

I'm a relative novice to OpenMP. I decided on it because my literature survey seemed to indicate that it requires far less code intervention than, e.g., direct POSIX thread coding would (though it seems OpenMP is basically designed to be a user-friendly macro-ization of POSIX threads).

I have a large-integer-arithmetic C code that I'm currently trying to parallelize. The key operation is a big-int multiply algorithm that uses a double-precision FFT to effect the multiply - we're talking

OpenMP guidance


I am developing a neural network package using Intel C++ compiler 9.0. The code is so parallel that using OpenMP is a no-brainer. The problem is knowing when not to use it.

Most of my code is vector operations (dot product, vector add, scaling, etc.). What I am looking for is some guidance as to when it becomes detrimental to parallelize. For example, it is probably worth parallelizing A.B if the dimensionality of vectors A and B is 10^6. But should I parallelize such loops when I expect the typical dimensionality to be 100, 1000, or 10000?
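One way to keep the choice open is OpenMP's `if` clause, which makes a parallel region run serially when a condition is false. The sketch below applies it to a dot product. The function name `dot` and the cutoff `PAR_THRESHOLD` are hypothetical, and the value 10000 is only a placeholder: the real break-even point depends on the machine, the thread count, and the loop body, so it has to be found by timing.

```c
#include <stddef.h>

/* Hypothetical cutoff (an assumption, not a measured value):
   vectors shorter than this are processed by a single thread. */
#define PAR_THRESHOLD 10000

/* Dot product of a and b, both of length n. */
double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    long i;

    /* The if clause suppresses thread-level parallelism for small n,
       so short vectors avoid the fork/join and reduction overhead. */
    #pragma omp parallel for reduction(+:sum) if(n >= PAR_THRESHOLD)
    for (i = 0; i < (long)n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

Timing `dot` for n = 100, 1000, 10000, and so on, once with the pragma active and once compiled without OpenMP, would show roughly where the crossover lies on a given system.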

32 bit chip and memory

For an Intel 32-bit chip (CPU), the maximum memory it can allocate is 4 GB.

1) Does this mean a single 32-bit CPU is restricted to 4 GB?

2) If YES to 1), can I say that for a server with two 32-bit CPUs, the maximum memory that can be allocated becomes 2 x 4 GB (or 8 GB)?

3) If still YES to 2), can I use the formula

Max. memory (GB) = Number of CPUs x 4 GB

Thanks for the help.


I have been experimenting with KMP_SET_BLOCKTIME

In the process I noticed something a little discouraging. The intention of the block time, when it is not zero, is to keep the threads that have finished working (in a section) running while the remainder of the threads complete the section. The purpose is to avoid an operating system context switch on some or all of the threads before you enter the next parallel section.
