Introducing Application Performance Snapshot

Use Application Performance Snapshot for a quick view into how a shared memory or MPI application uses the available hardware (CPU, FPU, and memory). Application Performance Snapshot analyzes your application's CPU and FPU usage, I/O and memory footprint, memory access stalls, and MPI and OpenMP* utilization. After analysis, it displays basic performance enhancement opportunities for systems using Intel® platforms. Use this tool as a first step in application performance analysis to get a simple snapshot of key optimization areas.

You can download Application Performance Snapshot for free from the Intel® Developer Zone at https://software.intel.com/performance-snapshot. The tool is also available pre-installed as part of Intel® Parallel Studio or Intel® VTune™ Amplifier.

Note

Starting with the 2018 Beta release, the updated Application Performance Snapshot for Linux* OS includes most of the functionality previously available in the MPI Performance Snapshot. MPI Performance Snapshot is no longer available as a separate tool.

What's New

This User's Guide documents Application Performance Snapshot for Linux* OS.

This is a change log for the current and previous product releases:

Application Performance Snapshot 2019 Update 4

  • Ability to collect internal IDs of communicators provided by Intel MPI. This feature requires both Application Performance Snapshot and Intel MPI to be version 2019 Update 4 or newer.

Application Performance Snapshot 2019 Update 3

  • Ability to generate an HTML-based rank-to-rank communication diagram by message volume to better visualize MPI application communication patterns.

Application Performance Snapshot 2019 Update 2

  • Full-featured OpenMPI* support
  • Improved vectorization efficiency metrics
  • MPI Imbalance time is no longer calculated at the default statistics level (1), which reduces collection overhead at that level
  • aps-report: added option to display statistics only for the selected set of MPI functions
  • MPI collector general optimizations

Application Performance Snapshot 2019 Update 1

  • MPI Imbalance collection extended with a mode that enables measuring pure application imbalance. This mode is applicable to MPI implementations that are binary compatible with MPICH. If required, you can switch off the imbalance collection to minimize collection overhead.

  • MPI tracing overhead improvements with a noticeable impact on cases with a large number of ranks.

Application Performance Snapshot 2019

  • Intel® Omni-Path Architecture Interconnect Bandwidth and Packet rate metrics added to explore MPI communication bottlenecks.

  • Added an HTML-based rank-to-rank communication diagram to better visualize MPI application communication patterns.

Application Performance Snapshot 2018 Update 3 and 2019 Beta Update

  • The aps-report utility gained the --format option, which generates the report in either text (*.txt) or comma-separated (*.csv) format. The CSV format can be useful for report processing automation or export to spreadsheet programs such as Microsoft Excel*.

  • The Rank-to-Rank data transfers report was enriched with an aggregated communication time column.

  • MPI trace file size was reduced through compression and by making the minimal statistics level the default. Some reports generated by the aps-report utility are not available at the minimal statistics level. See Controlling Amount of Collected Data for more information.

  • Report generation time with the aps-report utility was significantly improved.
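
As an illustration of the report processing automation that the CSV format enables, here is a minimal Python sketch. It assumes a report has already been exported to CSV and that its first row is a header; the column names `Function` and `Time(sec)` and the sample values are hypothetical placeholders, not the actual layout of any aps-report output.

```python
import csv
import io


def read_csv_report(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def top_rows(rows, column, n=3):
    """Return the n rows with the largest numeric value in `column`."""
    return sorted(rows, key=lambda r: float(r[column]), reverse=True)[:n]


# Hypothetical sample data; a real report's columns and values will differ.
SAMPLE = """Function,Time(sec)
MPI_Allreduce,12.5
MPI_Barrier,3.2
MPI_Send,7.8
"""

rows = read_csv_report(SAMPLE)
top = top_rows(rows, "Time(sec)", n=2)
for row in top:
    print(row["Function"], row["Time(sec)"])
```

To use this with a real file, read the file contents and pass them to `read_csv_report`, then adjust the column name to match the header of the report you generated.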

Application Performance Snapshot 2018 Update 2

Application Performance Snapshot 2018 Update 1

  • Removed restrictions for MPI_Pcontrol region numbers.

Application Performance Snapshot 2018

  • The tool is now invoked as aps rather than aps.sh.

  • The result directory name changed from stat_* to aps_result_*.
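
Scripts that post-process results may need to handle both naming schemes during a transition. The following Python sketch shows one way to separate result directories by prefix; the directory suffixes used in the demonstration are hypothetical, and only the `stat_` and `aps_result_` prefixes come from the change described above.

```python
import tempfile
from pathlib import Path


def find_result_dirs(root):
    """Return (current, legacy) lists of result directory names under root."""
    root = Path(root)
    current = sorted(p.name for p in root.glob("aps_result_*") if p.is_dir())
    legacy = sorted(p.name for p in root.glob("stat_*") if p.is_dir())
    return current, legacy


# Demonstration in a temporary directory; the names below are made up.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("aps_result_a", "stat_b", "notes"):
        (Path(tmp) / name).mkdir()
    current, legacy = find_result_dirs(tmp)

print("current:", current)
print("legacy:", legacy)
```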

Application Performance Snapshot 2018 Beta

  • Initial release.