Hi, I'm threading loops via OpenMP and Pthreads.
I am looking at a published MPSC queue implementation with a wait-free producer that makes use of an atomic exchange. If I understand correctly, on an Intel platform this would be backed by a "lock xchg".
I am fairly new to this subject and trying to understand the implications. By claiming the producer is "wait-free", we are saying it completes in a bounded number of steps, i.e. within a bounded time frame. Is this possible with a "lock xchg"?
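To make the question concrete, here is a minimal sketch of the kind of producer path being described. It assumes a linked-list MPSC queue in the style of the well-known exchange-based designs; the post does not name the implementation, so `Node`, `MpscQueue`, `push` and `pop` are hypothetical names chosen for illustration. The point to notice is that `push` contains no retry loop: it is one unconditional atomic exchange plus one store. On x86, compilers typically lower `exchange` to an `xchg` with a memory operand, which is locked implicitly.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type, for illustration only.
struct Node {
    std::atomic<Node*> next{nullptr};
    int value{0};
};

// Hypothetical exchange-based MPSC queue (many producers, one consumer).
struct MpscQueue {
    Node stub;                // dummy node so the list is never empty
    std::atomic<Node*> head;  // most recently pushed node, swapped by producers
    Node* tail;               // current dummy node, touched only by the consumer

    MpscQueue() : head(&stub), tail(&stub) {}

    // Producer side: one atomic exchange and one store, no loop, no CAS retry.
    void push(Node* n) {
        assert(n != nullptr);
        n->next.store(nullptr, std::memory_order_relaxed);
        Node* prev = head.exchange(n, std::memory_order_acq_rel);
        // Link the previous head to the new node so the consumer can reach it.
        prev->next.store(n, std::memory_order_release);
    }

    // Consumer side (single thread only). Returns false if no node is linked yet.
    // The node just read becomes the new dummy, so it must stay alive.
    bool pop(int& out) {
        Node* next = tail->next.load(std::memory_order_acquire);
        if (next == nullptr) return false;
        out = next->value;
        tail = next;
        return true;
    }
};
```

The producer executes a fixed number of instructions regardless of what other threads do, which is the usual basis for the "wait-free" claim. Whether the hardware bounds the latency of the locked exchange itself is the separate question raised above.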
Chapter 12 of the most recent (June 2013) "Intel 64 and IA-32 Architectures Optimization Reference Manual" contains enabling and tuning recommendations for Intel(r) Transactional Synchronization Extensions in the 4th generation Intel(r) Core(tm) processor family.
I have written a function that, according to VTune, incurs a tremendous amount of overhead in [OpenMP dispatcher], called by [OpenMP fork], on behalf of one particular parallel region of mine. That fork accounts for roughly a third of all CPU time in my program. My code is as follows. My intention is to have two parfor loops running concurrently.
I am working on a project that involves controlling which physical memory addresses are assigned to a process in order to reduce cache line eviction in the third-level (shared L3) cache. This essentially comes down to partitioning the L3 by software means. For this project I am using a single Xeon X3430 CPU with each process pinned to a separate core.
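Software partitioning of this kind is usually done by page coloring: physical pages that map to the same group of cache sets share a "color", and each process is given pages of its own colors only. A minimal sketch of the color computation follows. It assumes the cache is indexed directly by physical address bits (no hashing) and uses a hypothetical `CacheGeometry` struct; the geometry values shown in the usage note are assumptions and should be checked against the actual CPU.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of the cache and page size, for illustration only.
struct CacheGeometry {
    std::uint64_t size_bytes;  // total cache capacity
    std::uint64_t ways;        // associativity
    std::uint64_t page_bytes;  // OS page size
};

// Number of distinct page colors: how many pages fit into one cache way.
constexpr std::uint64_t num_colors(const CacheGeometry& g) {
    return (g.size_bytes / g.ways) / g.page_bytes;
}

// Color of the physical page that contains paddr.
// Assumes the set index is taken directly from physical address bits.
inline std::uint64_t page_color(std::uint64_t paddr, const CacheGeometry& g) {
    const std::uint64_t colors = num_colors(g);
    assert(colors != 0 && (colors & (colors - 1)) == 0);  // must be a power of two
    return (paddr / g.page_bytes) % colors;
}
```

For example, with an assumed geometry of 8 MiB, 16 ways and 4 KiB pages, `num_colors` gives 128, so giving each of four pinned processes a disjoint set of 32 colors would split the cache into four equal partitions.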
I get errors using mpirun within any cpuset (except the /root one).
The following mpirun is executed from a login shell which belongs to a cpuset manually created under the /root one:
mpirun -np 8 wam
/opt/intel/impi/4.1.0.024/intel64/bin/mpirun: line 390: 15578 Floating point exception mpiexec.hydra "$@" 0<&0
The same error happens whatever value -np has, and also without the -np flag.
If the same command is executed when no cpusets have been manually created (so the shell belongs to the /root cpuset), then mpirun works like a charm.
I'm not sure if this is the right place to ask questions like this, but I'll try.
I'm writing code which enumerates the CPU topology. I'm not sure I have fully understood the Intel 64 Architecture Processor Topology Enumeration manual.
The x2APIC ID is divided into three bitfields (package, core, and logical processor IDs). According to this manual, to obtain these three sub-IDs I must do as follows
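For reference, the usual decomposition can be sketched as below. This is a minimal sketch assuming the two shift widths have already been read from CPUID leaf 0BH (EAX[4:0] of sub-leaf 0 for the SMT level and of sub-leaf 1 for the core level); `TopologyIds` and `split_x2apic_id` are hypothetical names, and the CPUID query itself is left out so the code stays portable.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type, for illustration only.
struct TopologyIds {
    std::uint32_t smt;      // logical processor within its core
    std::uint32_t core;     // core within its package
    std::uint32_t package;  // physical package
};

// smt_shift:  width in bits of the SMT field (CPUID.0BH, sub-leaf 0, EAX[4:0]).
// core_shift: combined width of the SMT and core fields (sub-leaf 1, EAX[4:0]).
inline TopologyIds split_x2apic_id(std::uint32_t x2apic_id,
                                   unsigned smt_shift,
                                   unsigned core_shift) {
    assert(smt_shift <= core_shift && core_shift < 32);
    const std::uint32_t smt_mask = (1u << smt_shift) - 1u;
    const std::uint32_t core_plus_smt_mask = (1u << core_shift) - 1u;

    TopologyIds ids;
    ids.smt = x2apic_id & smt_mask;                             // lowest bits
    ids.core = (x2apic_id & core_plus_smt_mask) >> smt_shift;   // middle bits
    ids.package = x2apic_id >> core_shift;                      // remaining high bits
    return ids;
}
```

With an x2APIC ID of 0x2D, an SMT width of 1 and a combined width of 4, this yields SMT ID 1, core ID 6 and package ID 2.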
This two-day webinar series introduces you to the world of multicore and manycore computing with Intel® Xeon processors and Intel® Xeon Phi™ coprocessors. Expert technical teams at Intel discuss development tools, programming models, vectorization, and execution models that will get your development efforts powered up to get the best out of your applications and platforms.