Developer Guide and Reference

Profile an Application with Instrumentation

This topic provides detailed information on how to profile an application, with sample commands for each phase (or step).
Profiling an application includes the following three phases:
  1. Instrumentation compilation and linking
    Use [Q]prof-gen to produce an executable with instrumented information included. Use the /Qcov-gen (Windows) option to obtain minimum instrumentation only for code coverage.
    Operating System    Commands
    Linux and macOS*    icpc -prof-gen -prof-dir/usr/profiled a1.cpp a2.cpp a3.cpp
                        icpc a1.o a2.o a3.o
    Windows             icl /Qprof-gen /Qprof-dir:c:\profiled a1.cpp a2.cpp a3.cpp
                        icl a1.obj a2.obj a3.obj
    Windows             icl /Qcov-gen /Qcov-dir:c:/cov_data a1.cpp a2.cpp a3.cpp
                        icl a1.obj a2.obj a3.obj
    Use the [Q]prof-dir or /Qcov-dir (Windows) option if the application includes source files located in multiple directories; using the option ensures the profile information is generated in one consistent place. The example commands demonstrate how to combine these options on multiple sources.
    The compiler gathers extra information when you use the -prof-gen=srcpos (Linux and macOS*) or /Qprof-gen:srcpos (Windows) option; however, the extra information is collected to support specific Intel tools, including the code coverage tool. If you do not expect to use such tools, do not specify -prof-gen=srcpos (Linux and macOS*) or /Qprof-gen:srcpos (Windows); the extended option does not provide better optimization and could slow parallel compile times. If you are interested in using the instrumentation only for code coverage, use the /Qcov-gen (Windows) option instead of the /Qprof-gen:srcpos (Windows) option to minimize instrumentation overhead.
    PGO data collection is optimized for collecting data on serial applications at the expense of some loss of precision in areas of high parallelism. However, you can specify the threadsafe keyword with the -prof-gen (Linux* and macOS*) or the /Qprof-gen (Windows) compiler option for files or applications that contain parallel constructs, for example those using OpenMP* features. Using the threadsafe keyword produces instrumented object files that support the collection of PGO data on applications that use a high level of parallelism, but it may increase the overhead for data collection.
    Unlike serial programs, parallel programs using OpenMP* may involve dynamic scheduling of code paths, and the counts collected may not be perfectly reproducible for the same training data set.
  2. Instrumented execution
    Run your instrumented program with a representative set of data to create one or more dynamic information files.
    Operating System    Command
    Linux and macOS*    ./a1.out
    Windows             a1.exe
    Executing the instrumented application generates a dynamic information file that has a unique name and a .dyn suffix. A new dynamic information file is created every time you execute the instrumented program.
    You can run the program more than once with different input data.
    By default, the .dyn filename follows this naming convention: <timestamp>_<pid>.dyn. The .dyn file is placed into a directory specified by an environment variable, a compile-time specified directory, or the current directory.
    To make it easy to distinguish files from different runs, you can specify a prefix for the .dyn filename in the environment variable INTEL_PROF_DYN_PREFIX. In that case, executing the instrumented application generates a .dyn filename as follows: <prefix>_<timestamp>_<pid>.dyn, where <prefix> is the identifier that you have specified. Be sure to set the INTEL_PROF_DYN_PREFIX environment variable before starting your instrumented application.
    The value specified in the INTEL_PROF_DYN_PREFIX environment variable must not contain any of these characters:
    < > : " / \ | ? *
    The default naming scheme is used if an invalid prefix is specified.
  3. Feedback compilation
    Before this step, copy all .dyn and .dpi files into the same directory. Compile and link the source files with [Q]prof-use; the option instructs the compiler to use the generated dynamic information to guide the optimization:
    Operating System    Examples
    Linux and macOS*    icpc -prof-use -ipo -prof-dir/usr/profiled a1.cpp a2.cpp a3.cpp
    Windows             icl /Qprof-use /Qipo /Qprof-dir:c:\profiled a1.cpp a2.cpp a3.cpp
    This final phase compiles and links the source files using the data from the dynamic information files generated during instrumented execution (phase 2).
    In addition to the optimized executable, the compiler produces a pgopti.dpi file.
    Most of the time, you should specify the default optimization level, O2, for phase 1, and specify more advanced optimizations, such as [Q]ipo, during the phase 3 (final) compilation. The examples above rely on the default O2 in phase 1, where no optimization option is given, and use [Q]ipo in phase 3.
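The file-handling rules from phases 2 and 3 can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the Intel tooling: the helper names (is_valid_prefix, expected_dyn_name, collect_profile_files) are hypothetical, the timestamp is treated as an opaque string because its format is not described here, and an empty prefix is assumed to mean "no prefix set".

```python
import shutil
from pathlib import Path

# Characters that INTEL_PROF_DYN_PREFIX must not contain, per the rule above.
FORBIDDEN_PREFIX_CHARS = set('<>:"/\\|?*')


def is_valid_prefix(prefix: str) -> bool:
    """Return True if the prefix contains none of the forbidden characters."""
    return not (set(prefix) & FORBIDDEN_PREFIX_CHARS)


def expected_dyn_name(timestamp: str, pid: int, prefix: str = "") -> str:
    """Build a .dyn filename that follows the documented naming scheme.

    An empty prefix (assumption: treated as unset) or an invalid prefix
    falls back to the default <timestamp>_<pid>.dyn form.
    """
    base = f"{timestamp}_{pid}.dyn"
    if prefix and is_valid_prefix(prefix):
        return f"{prefix}_{base}"
    return base


def collect_profile_files(run_dirs, target_dir):
    """Copy every .dyn and .dpi file found in run_dirs into target_dir.

    Assumes filenames are unique across runs, as the naming scheme suggests;
    a file with a repeated name would overwrite the earlier copy.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for run_dir in run_dirs:
        for path in sorted(Path(run_dir).iterdir()):
            if path.suffix in {".dyn", ".dpi"}:
                shutil.copy2(path, target / path.name)
                copied.append(path.name)
    return copied
```

A script could call collect_profile_files before the phase 3 compilation so that all dynamic information files sit in the directory passed to [Q]prof-dir.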
During phase 1, the compiler ignores the [Q]ipo or [Q]ip option when it is specified with [Q]prof-gen.
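As a rough illustration of how the phase 1 and phase 3 command lines differ, the following Python sketch assembles them as argument lists. The function names and the extra parameter are hypothetical; the option spellings are copied from the example tables above, and only the compile line of phase 1 is reproduced (the separate link line is omitted). Dropping [Q]ipo and [Q]ip from the phase 1 command simply mirrors the fact that the compiler ignores them there.

```python
# Options the compiler ignores when combined with [Q]prof-gen in phase 1.
IGNORED_WITH_PROF_GEN = {"-ipo", "-ip", "/Qipo", "/Qip"}


def _tools(profile_dir, windows):
    """Return (compiler, prof-gen, prof-use, ipo, prof-dir) for one platform."""
    if windows:
        return "icl", "/Qprof-gen", "/Qprof-use", "/Qipo", f"/Qprof-dir:{profile_dir}"
    # The Linux/macOS spelling follows the example tables, with the
    # directory written directly after -prof-dir.
    return "icpc", "-prof-gen", "-prof-use", "-ipo", f"-prof-dir{profile_dir}"


def phase1_command(sources, profile_dir, extra=(), windows=False):
    """Instrumentation compile: prof-gen plus any extra options that matter."""
    compiler, gen, _use, _ipo, prof_dir = _tools(profile_dir, windows)
    kept = [opt for opt in extra if opt not in IGNORED_WITH_PROF_GEN]
    return [compiler, gen, *kept, prof_dir, *sources]


def phase3_command(sources, profile_dir, windows=False):
    """Feedback compile: prof-use together with the more advanced ipo option."""
    compiler, _gen, use, ipo, prof_dir = _tools(profile_dir, windows)
    return [compiler, use, ipo, prof_dir, *sources]
```

Joining the result of phase3_command(["a1.cpp", "a2.cpp", "a3.cpp"], "/usr/profiled") with spaces gives the Linux and macOS* line shown in the phase 3 table.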
