Many of today’s HPC applications use Intel® MPI to implement their parallelism. However, using Intel’s analysis tools in a multi-process environment can be tricky. Intel® Advisor can help you maximize vectorization, memory, and threading performance, and its Roofline chart lets you visualize performance bottlenecks. To get the most out of your results when analyzing Intel MPI applications with Intel Advisor, follow these steps.
Remote analysis flow
First, collect data using the following command on the target:
Intel® Cluster Checker verifies the configuration and performance of Linux-based clusters and checks compliance with the Intel® Scalable System Framework architecture specification. If it finds issues, Intel® Cluster Checker diagnoses the problems and may recommend how to repair the cluster.
Intel® Cluster Checker has the following features:
Since HPC applications target high performance, users want to analyze the runtime performance of such applications. To get a representative picture of that performance and behavior, it can be important to gather analysis data at the same scale as regular production runs. Doing so, however, would mean running shared-memory-focused analysis types on every node of the run in parallel. This might not be in the user’s best interest, especially since the behavior of a well-balanced MPI application should be very similar across all nodes.
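One way to act on this observation is to run the job at full scale but attach the analysis tool to only a few ranks. The sketch below is a minimal illustration, not the article's own method: it builds an `mpirun` command line that uses the Intel MPI `-gtool` option to wrap selected ranks with an Intel Advisor survey collection. The helper name `build_gtool_command`, the default project directory, and the application path are hypothetical, and the exact tool flags are assumptions that should be checked against the documentation for your installed versions.

```python
import shlex


def build_gtool_command(total_ranks, analyzed_ranks, app,
                        project_dir="./advisor_results"):
    """Build an mpirun argument list that profiles only selected ranks.

    Assumes Intel MPI's `-gtool "<tool command>:<rank set>"` syntax and the
    Intel Advisor command-line tool `advixe-cl`; both are assumptions here.
    """
    if not analyzed_ranks:
        raise ValueError("at least one rank must be selected for analysis")
    for rank in analyzed_ranks:
        if not 0 <= rank < total_ranks:
            raise ValueError(f"rank {rank} is outside 0..{total_ranks - 1}")

    # Deduplicate and sort so the rank set is stable, e.g. "0,2".
    rank_set = ",".join(str(r) for r in sorted(set(analyzed_ranks)))

    # The tool command and the rank set travel together as one argument.
    tool = (f"advixe-cl --collect=survey "
            f"--project-dir={project_dir}:{rank_set}")
    return ["mpirun", "-n", str(total_ranks), "-gtool", tool, app]


if __name__ == "__main__":
    # Hypothetical 4-rank run in which only rank 0 is analyzed.
    cmd = build_gtool_command(4, [0], "./my_mpi_app")
    # Print a shell-quoted version; nothing is actually launched here.
    print(shlex.join(cmd))
```

The function only assembles the command, so it can be inspected or logged before being handed to a job script. All ranks still run, so the application behaves as in production while analysis overhead stays confined to the chosen ranks.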
High-performance computing is changing fast, driven by trends like machine learning and by next-generation hardware like the Intel® Xeon Phi™ processor. To help developers make the most of these changes, Intel® Parallel Studio XE 2017 delivers a host of new capabilities.