Initialization, Termination, and Control

Intel® Trace Collector is automatically initialized within the execution of the MPI_Init() routine. During the execution of the MPI_Finalize() routine, the trace data collected in memory or in temporary files is consolidated and written into the permanent trace file(s), and Intel® Trace Collector is terminated. Thus, it is an error to call Intel® Trace Collector API functions before MPI_Init() has been executed or after MPI_Finalize() has returned.

In non-MPI applications it may be necessary to start and stop Intel® Trace Collector explicitly. These calls also help write programs and libraries that use VT without depending on MPI.

VT_initialize(), VT_getrank(), and VT_finalize() can be used to write applications or libraries that work both with and without MPI, depending on whether they are linked with libVT.a plus MPI or with libVTcs.a (distributed tracing) and no MPI.

If the MPI that Intel® Trace Collector was compiled for provides MPI_Init_thread(), then VT_init() calls MPI_Init_thread() with the parameter required set to MPI_THREAD_FUNNELED. This is sufficient to initialize multithreaded applications where only the main thread calls MPI. If your application requires a higher thread level, either use MPI_Init_thread() instead of VT_init() or, if VT_init() is called by something outside your control (for example, by your runtime environment), set the environment variable VT_THREAD_LEVEL to a value from 0 through 3 to choose a thread level from MPI_THREAD_SINGLE through MPI_THREAD_MULTIPLE.
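The mapping from VT_THREAD_LEVEL to a thread level can be sketched as follows. This is only an illustration, not Intel® Trace Collector code: the helper names are hypothetical, the values 1 and 2 are assumed to follow the MPI standard's ordering of thread levels, and falling back to MPI_THREAD_FUNNELED for unset or invalid values is an assumption based on the default described above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the name of the MPI thread level selected by
 * a VT_THREAD_LEVEL value. NULL or anything other than a single digit 0..3
 * falls back to MPI_THREAD_FUNNELED (assumed fallback, see lead-in). */
const char *thread_level_name(const char *value)
{
    static const char *const names[] = {
        "MPI_THREAD_SINGLE",      /* 0 */
        "MPI_THREAD_FUNNELED",    /* 1 */
        "MPI_THREAD_SERIALIZED",  /* 2 */
        "MPI_THREAD_MULTIPLE"     /* 3 */
    };

    if (value != NULL && strlen(value) == 1 &&
        value[0] >= '0' && value[0] <= '3')
        return names[value[0] - '0'];
    return names[1];
}

/* Hypothetical helper: looks up the level requested in the environment. */
const char *requested_thread_level(void)
{
    return thread_level_name(getenv("VT_THREAD_LEVEL"));
}
```

With VT_THREAD_LEVEL=3 in the environment, requested_thread_level() returns the string "MPI_THREAD_MULTIPLE".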

It is not an error to call VT_initialize() twice or after MPI_Init().

In an MPI application written in C, the program's parameters must be passed because the underlying MPI might require them. Otherwise, they are optional, and 0 or a NULL pointer may be used. If parameters are passed, the number of parameters and the array itself may be modified, either by MPI or by Intel® Trace Collector itself.
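A minimal sketch of explicit start and stop, under stated assumptions: the USE_VT macro and the run_traced() wrapper are hypothetical, and the stand-in functions in the #else branch exist only so the sketch builds without Intel® Trace Collector. Their signatures are assumed to mirror VT.h and should be checked against your installation.

```c
#include <stdio.h>

#ifdef USE_VT                 /* hypothetical build switch */
#include <VT.h>               /* real API; link with libVT.a or libVTcs.a */
#else
/* Stand-ins with assumed signatures; they do no tracing at all. */
enum { VT_OK = 0 };
static int VT_initialize(int *argc, char ***argv)
{
    (void)argc;
    (void)argv;
    return VT_OK;
}
static int VT_getrank(int *rank) { *rank = 0; return VT_OK; }
static int VT_finalize(void) { return VT_OK; }
#endif

/* Hypothetical wrapper: start tracing, report the rank, stop tracing.
 * Returns 0 on success and 1 on any failure. */
int run_traced(int *argc, char ***argv)
{
    int rank = -1;

    /* Pass the real argc/argv: both may be modified in place. */
    if (VT_initialize(argc, argv) != VT_OK)
        return 1;

    if (VT_getrank(&rank) != VT_OK) {
        VT_finalize();
        return 1;
    }
    printf("process %d: tracing active\n", rank);

    /* ... application work goes here ... */

    return VT_finalize() == VT_OK ? 0 : 1;
}
```

A main() function would typically just forward its parameters: return run_traced(&argc, &argv);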

Intel® Trace Collector assumes that argv[0] is the executable's name and uses this string to find the executable and as the basename for the default logfile name. Other parameters are ignored unless there are special --tracecollector-args parameters. In this case all following parameters are interpreted as configuration options, written with a double hyphen as prefix and a hyphen instead of underscores (for example, --tracecollector-args --logfile-format BINARY --logfile-prefix /tmp). These parameters are then removed from the argv array, but not freed. To continue with the program's normal parameters, --tracecollector-args-end may be used. There may be more than one block of Intel® Trace Collector arguments in the command line.
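The removal of configuration blocks from argv can be illustrated with a small sketch. It is not the Intel® Trace Collector implementation: strip_tracecollector_args() is a hypothetical helper that only drops the blocks and does not interpret the options inside them. It also assumes that argv has room for a terminating NULL entry, as the argv passed to main() does.

```c
#include <string.h>

/* Hypothetical helper, not part of the VT API: compacts argv in place,
 * dropping every --tracecollector-args block up to the matching
 * --tracecollector-args-end (or up to the end of the command line).
 * Returns the new argument count. Removed strings are not freed. */
int strip_tracecollector_args(int argc, char **argv)
{
    int in_block = 0;  /* nonzero while inside a configuration block */
    int kept = 0;      /* number of program parameters kept so far */

    for (int i = 0; i < argc; i++) {
        if (!in_block && strcmp(argv[i], "--tracecollector-args") == 0) {
            in_block = 1;            /* block starts; drop the marker */
        } else if (in_block &&
                   strcmp(argv[i], "--tracecollector-args-end") == 0) {
            in_block = 0;            /* block ends; drop the marker */
        } else if (!in_block) {
            argv[kept++] = argv[i];  /* normal program parameter */
        }
        /* otherwise: a configuration option or its value, dropped here */
    }
    argv[kept] = NULL;  /* assumes argv has argc + 1 slots, as in main() */
    return kept;
}
```

For the command line app --tracecollector-args --logfile-format BINARY --tracecollector-args-end input.dat, the helper leaves app and input.dat in argv and returns 2.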

See the descriptions of VT_initialize(), VT_getrank(), and VT_finalize().

The following functions control the tracing of threads in a multithreaded application:

The recording of performance data can be controlled on a per-process basis by calls to the VT_traceon() and VT_traceoff() routines: a thread calling VT_traceoff() will no longer record any state changes, MPI communication or counter events. Tracing can be re-enabled by calling the VT_traceon() routine. The collection of statistics data is not affected by calls to these routines. With the API routine VT_tracestate() a process can query whether events are currently being recorded.
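One way to use these routines is to pause recording around a region and restore the previous state afterwards, so that nested pauses do not re-enable tracing too early. The sketch below makes several assumptions: the USE_VT macro and the helpers tracing_enabled(), trace_pause(), and trace_resume() are hypothetical; the stand-ins in the #else branch only mimic the routines with assumed signatures; and any nonzero value from VT_tracestate() is treated as "recording".

```c
#ifdef USE_VT                 /* hypothetical build switch */
#include <VT.h>               /* real API */
#else
/* Stand-ins with assumed signatures; they only track an on/off flag. */
enum { VT_OK = 0 };
static int stub_recording = 1;
static void VT_traceoff(void) { stub_recording = 0; }
static void VT_traceon(void) { stub_recording = 1; }
static int VT_tracestate(int *state)
{
    *state = stub_recording;
    return VT_OK;
}
#endif

/* Hypothetical helper: 1 if events are currently recorded, 0 otherwise. */
int tracing_enabled(void)
{
    int state = 0;
    return VT_tracestate(&state) == VT_OK && state != 0;
}

/* Hypothetical helper: switch recording off and return the previous state. */
int trace_pause(void)
{
    int was_on = tracing_enabled();
    VT_traceoff();
    return was_on;
}

/* Hypothetical helper: re-enable recording only if it was on before. */
void trace_resume(int was_on)
{
    if (was_on)
        VT_traceon();
}
```

A caller would wrap the region it does not want recorded between int prev = trace_pause(); and trace_resume(prev);.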

See the descriptions of VT_traceon(), VT_traceoff(), and VT_tracestate().

With the Intel® Trace Collector configuration mechanisms described in the Intel® Trace Collector Configuration section, the recording of state changes can be controlled per symbol or activity. For any defined symbol, the VT_symstate() routine returns whether data recording for that symbol has been disabled.

See the description of VT_symstate().

Intel® Trace Collector minimizes the instrumentation overhead by first storing the recorded trace data locally in each processor's memory and saving it to disk only when the memory buffers fill up. Calling the VT_flush() routine forces a process to save the in-memory trace data to disk and to mark the duration of the flush in the trace. After VT_flush() returns, Intel® Trace Collector continues normally.

Intel® Trace Collector makes its internal clock available to applications, which can be useful for writing instrumentation code that works with both MPI and non-MPI applications.

For more detailed information, refer to the following sections:

See Also

Intel® Trace Collector Configuration
