P-Gadget3 is a computational astrophysics code. Scientists use it to simulate self-gravitating systems with additional complex gas physics. The problems it helps to solve include the formation of cosmological large-scale structure, galaxy clusters and galaxies, star formation, and metal enrichment. To model gas physics, P-Gadget3 uses smoothed particle hydrodynamics (SPH), a mesh-free computational method that approximates parcels of gas or fluid as particles. The code scales to hundreds of thousands of cores and has a considerable user base.
Recently we learned about performance optimization work on the SPH solver in P-Gadget3. It was carried out by Dr. Fabio Baruffa, a Senior HPC Application Specialist at the Leibniz Supercomputing Centre.
Dr. Baruffa described his working methods:
- the isolation of a kernel code with serialization
- the usage of Intel® VTune™ to spot bottlenecks, and
- the principle of a minimally invasive approach.
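The first of these methods, isolating a kernel through serialization, can be sketched roughly as follows. The idea is to dump the kernel's input from a full run once, then reload it in a small standalone driver so that the kernel can be profiled and tuned without launching the whole simulation. This is only a minimal sketch under stated assumptions: the flat `float` payload, the simple count-plus-data file layout, and the function names `snapshot_write` and `snapshot_read` are hypothetical and are not taken from P-Gadget3.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical layout: one 64-bit element count followed by raw floats.
 * The file is only meant to be read back on the machine that wrote it. */

/* Write n floats to f. Returns 0 on success, -1 on an I/O error. */
int snapshot_write(FILE *f, const float *data, size_t n)
{
    assert(f != NULL);
    unsigned long long count = (unsigned long long)n;
    if (fwrite(&count, sizeof count, 1, f) != 1)
        return -1;
    if (n > 0 && fwrite(data, sizeof *data, n, f) != n)
        return -1;
    return 0;
}

/* Read a snapshot written by snapshot_write. Returns a malloc'ed array
 * (the caller frees it) and stores the element count in *n_out.
 * Returns NULL on failure. */
float *snapshot_read(FILE *f, size_t *n_out)
{
    assert(f != NULL && n_out != NULL);
    unsigned long long count = 0;
    if (fread(&count, sizeof count, 1, f) != 1)
        return NULL;
    /* Allocate at least one element so an empty snapshot is not NULL. */
    float *data = malloc((count ? (size_t)count : 1) * sizeof *data);
    if (data == NULL)
        return NULL;
    if (count > 0 && fread(data, sizeof *data, (size_t)count, f) != count) {
        free(data);
        return NULL;
    }
    *n_out = (size_t)count;
    return data;
}
```

In such a setup, the full application would call the writer once just before entering the kernel, and the standalone driver would call the reader and then invoke the kernel in a timing loop.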
He also demonstrated performance optimization techniques used in this project, such as
- transformation to lockless loops and
- improved vectorization with the help of runtime conversion of an array of structures (AoS) to a structure of arrays (SoA).
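The AoS-to-SoA conversion can be illustrated with a short sketch. The code below gathers selected particles from an array of structures into separate contiguous arrays at run time, then runs a simple loop over those arrays with unit-stride access, which compilers generally find easier to vectorize. Everything here is an assumption made for illustration: the `ParticleAoS` fields, the index-list interface, and the toy weighting function are hypothetical and are not the actual P-Gadget3 data structures or SPH kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical AoS record: one struct per particle. */
typedef struct {
    float pos[3];
    float mass;
} ParticleAoS;

/* Hypothetical SoA buffer: one contiguous array per field. */
typedef struct {
    size_t n;
    float *x, *y, *z, *mass;
} ParticleSoA;

void soa_free(ParticleSoA *s)
{
    free(s->x); free(s->y); free(s->z); free(s->mass);
    s->x = s->y = s->z = s->mass = NULL;
    s->n = 0;
}

/* Allocate room for n particles. Returns 0 on success, -1 on failure. */
int soa_alloc(ParticleSoA *s, size_t n)
{
    size_t m = n ? n : 1;              /* avoid zero-size allocations */
    s->n = n;
    s->x = malloc(m * sizeof *s->x);
    s->y = malloc(m * sizeof *s->y);
    s->z = malloc(m * sizeof *s->z);
    s->mass = malloc(m * sizeof *s->mass);
    if (!s->x || !s->y || !s->z || !s->mass) {
        soa_free(s);
        return -1;
    }
    return 0;
}

/* Runtime conversion: gather the particles listed in idx into the SoA buffer. */
void aos_to_soa(const ParticleAoS *p, const int *idx, size_t n, ParticleSoA *s)
{
    assert(n <= s->n);
    for (size_t i = 0; i < n; ++i) {
        const ParticleAoS *q = &p[idx[i]];
        s->x[i] = q->pos[0];
        s->y[i] = q->pos[1];
        s->z[i] = q->pos[2];
        s->mass[i] = q->mass;
    }
}

/* Toy weighted sum around (cx, cy, cz) with weight w = 1 - r^2/h^2 for r < h.
 * Each field is read with unit stride, and the loop carries no dependency
 * other than the reduction into sum. */
float kernel_sum_soa(const ParticleSoA *s, float cx, float cy, float cz, float h)
{
    const float h2 = h * h;
    float sum = 0.0f;
    for (size_t i = 0; i < s->n; ++i) {
        const float dx = s->x[i] - cx;
        const float dy = s->y[i] - cy;
        const float dz = s->z[i] - cz;
        const float r2 = dx * dx + dy * dy + dz * dz;
        const float w = (r2 < h2) ? 1.0f - r2 / h2 : 0.0f;
        sum += s->mass[i] * w;
    }
    return sum;
}
```

The gather step adds a copy, so a conversion like this is worthwhile only when the loop that follows does enough arithmetic per particle to amortize it.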
The optimization effort yielded a substantial performance gain. On Intel® Xeon® processors, the optimized SPH kernel runs 2.6x to 4.7x faster. On 68-core Intel® Xeon Phi™ processors, the speedup is 20x.