User Guide

  • 12/15/2019

Controlling Amount of Collected Data

Application Performance Snapshot (APS) provides several methods to control the amount of collected data. This enables you to reduce profiling overhead and focus on relevant application sections.

Collection Control API

By default, APS collects statistics for the whole application run. In some cases, it is important to enable or disable the collection for a specific application phase. For example, you may want to focus on the most time-consuming section or disable collection for the initialization or finalization phases. APS provides APIs to control data collection from source code.
For MPI applications, use the MPI_Pcontrol API. Call MPI_Pcontrol(0) to pause data collection and MPI_Pcontrol(1) to resume it. For more information, refer to Region Control with MPI_Pcontrol.
For non-MPI applications, ITT API is also available. Before using ITT API, you need to configure your system. For instructions, refer to: . If APS is installed as a standalone package,  
is equal to
. After the system is configured, you can use ITT API. Call __itt_pause() and __itt_resume() to pause and resume data collection, respectively.
By default, profiling is enabled when the application is launched. To launch the application without profiling, use the --start-paused option. Profiling will begin automatically with the first call of MPI_Pcontrol(1) or __itt_resume(). This can be useful to skip the initialization phase.

MPI Imbalance Collection

By default, APS collects and reports on the MPI imbalance (idle time). The APS_IMBALANCE_TYPE environment variable allows for additional control over how the imbalance is calculated. The default level changes based on the setting of the MPS_STAT_LEVEL environment variable. To change the level, update the APS_IMBALANCE_TYPE environment variable.
Default value if
Turns off the imbalance calculation. Disabling the imbalance calculation reduces the overhead of APS, but does not provide information about MPI imbalance, which is an important statistic as part of application performance analysis.
For the Intel® MPI Library, imbalance (Idle time) is reported at this level.
Default value if
or higher.
Imbalance is calculated by calling MPI_Barrier before any collective operation and measuring the time of the call. This can provide data about application imbalance. For example, when some ranks finish their computation faster than others, they need to wait for the other ranks before the MPI collective operation can start. The wait time can be calculated using the

Filter Data by Type

APS allows you to filter statistics collection by type: MPI statistics, OpenMP* statistics, or hardware counters statistics. By default, data of all types is collected.
To specify data collection types, use the --collection-mode (-c) option. As an argument, specify a comma-separated list of the values mpi, omp, or hwc to enable statistics collection of the specified types. Use the all argument to enable statistics collection of all types (default).
For example, to disable hardware counters statistics in an MPI application:
mpirun -n 2 aps -c mpi,omp ./myapp
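The command above can be varied by changing the list passed to -c. The following sketch only prints candidate launch lines and does not run APS. The helper name aps_launch_line is hypothetical, and the rank count and ./myapp are placeholders taken from the example above.

```shell
# Hypothetical helper (not part of APS): print an APS launch line for a
# given comma-separated list of collection types. It does not start APS.
aps_launch_line() {
  types="$1"   # for example "mpi" or "mpi,omp"
  shift
  echo "mpirun -n 2 aps -c ${types} $*"
}

# MPI statistics only
aps_launch_line mpi ./myapp
# MPI and OpenMP statistics, hardware counters left out
aps_launch_line mpi,omp ./myapp
```

Printing the line first makes it easy to check the type list before committing to a long profiling run.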

Set MPI Level of Detail

For MPI applications, APS offers a multi-level approach to collecting statistics. There are five levels of detail that vary by the amount of data collected. By default, level 1 is enabled. To change the level, set the MPS_STAT_LEVEL environment variable.
This table summarizes available levels of detail.

Level        Information is collected about
1 (default)  MPI functions and their times
2            MPI functions and amount of transmitted data
3            MPI functions, communicators, and message sizes
4            MPI functions, communicators, communication directions and aggregated traffic for each direction
5            MPI functions, communicators, message sizes, and communication directions
Level 5 may provide too much information if an application uses a lot of communicators. In this case, consider reducing the statistics level. Also, some diagrams may be unavailable for statistics levels 1–4, depending on the availability of the information provided at that level.
The MPS_STAT_LEVEL value also affects the default value of the APS_IMBALANCE_TYPE environment variable. For more information, see MPI Imbalance Collection.
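As a minimal sketch of setting the level of detail from a job script, the helper below exports MPS_STAT_LEVEL only when the requested value is one of the five levels described in this section. The function name set_mps_stat_level is hypothetical and not part of APS, and the launch line in the final comment is a placeholder.

```shell
# Hypothetical helper (not part of APS): export MPS_STAT_LEVEL only if the
# requested value is one of the five documented levels (1 through 5).
set_mps_stat_level() {
  case "$1" in
    1|2|3|4|5)
      export MPS_STAT_LEVEL="$1"
      ;;
    *)
      echo "MPS_STAT_LEVEL must be between 1 and 5, got: $1" >&2
      return 1
      ;;
  esac
}

set_mps_stat_level 2
echo "MPS_STAT_LEVEL=${MPS_STAT_LEVEL}"   # prints MPS_STAT_LEVEL=2

# Placeholder launch line; the rank count and ./myapp are illustrative:
# mpirun -n 2 aps ./myapp
```

Rejecting out-of-range values up front avoids spending a full profiling run on a setting that is not one of the documented levels.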

Collect Internal IDs of Communicators

With APS and the Intel MPI Library at version 2019 Update 4 or newer, you can use APS to collect the internal IDs of communicators. The internal IDs do not change between runs as long as you keep the same number of nodes and processes per node. To enable this function:
  • Set the APS_COLLECT_COMM_IDS environment variable to 1.
  • Set MPS_STAT_LEVEL to 3 or higher. These are the only levels where APS collects information about communicators.
    export MPS_STAT_LEVEL=3
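
Taken together, the two settings above might look like this in a job script. This is a sketch: the check on the level is an added safeguard rather than something APS requires, and the launch line in the final comment is a placeholder.

```shell
# Enable collection of internal communicator IDs.
export APS_COLLECT_COMM_IDS=1
# Communicator information is collected only at level 3 or higher.
export MPS_STAT_LEVEL=3

# Added safeguard (an assumption, not an APS requirement): warn if the level
# was lowered elsewhere in the script.
if [ "${MPS_STAT_LEVEL}" -lt 3 ]; then
  echo "MPS_STAT_LEVEL=${MPS_STAT_LEVEL} is too low for communicator data" >&2
fi

echo "APS_COLLECT_COMM_IDS=${APS_COLLECT_COMM_IDS} MPS_STAT_LEVEL=${MPS_STAT_LEVEL}"

# Placeholder launch line; keep the node count and processes per node
# identical between runs so that the internal IDs stay the same:
# mpirun -n 2 aps ./myapp
```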
