Experimental feature: Viewing load imbalance in OpenMP* applications with Intel® VTune™ Amplifier XE
With Intel® VTune™ Amplifier XE 2013 Update 12 and earlier, it was possible to profile OpenMP applications with parallel regions as described in the article
This article describes a parallel merge sort code and explains why it is more scalable than parallel quicksort or parallel samplesort. The code relies on C++11 “move” semantics.
This article provides code access, build, and run directions for the miniGhost code, running on Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors.
The SIMD and multi-core features of modern processors enable large improvements in application performance, but only if the application is effectively optimized for parallel execution.