If your application makes heavy use of libraries or software components developed independently, you may want to exclude information that is not directly related to your application from the trace data. A library developer, on the other hand, might want the opposite: to trace only the data related to their library.
Intel® Trace Collector can turn off tracing for functions below a certain call stack level, that is, fold them. If you want to trace calls within the folded functions, you can unfold them.
To enable folding, use the FOLD and UNFOLD keywords for the STATE, SYMBOL or ACTIVITY configuration options to select functions for folding by their name (SYMBOL), class (ACTIVITY) or both (STATE). Use the CALLER keyword to specify the function caller. See Filtering Trace Data for details on syntax.
To enable Intel® Trace Collector to profile non-MPI functions, make sure to instrument them using the compiler instrumentation or API. See Tracing User Defined Events.
Below are examples of folding for the application with four additional libraries.
Figure: General Structure of an Application Using Multiple Libraries
From the figure above, the following information may be of interest for the application and the library developers:
lib1, lib2, and lib4 are called by the application. The application developer codes these calls and can change their sequence and parameters to improve performance (arrows "1").
lib3 is never called directly by the application. The application developer has no way to tailor the use of lib3, therefore these calls (arrows "3") are of no interest to them.
lib4 is called both directly by the application and indirectly through lib2. Only the direct use of lib4 can be influenced by the application developer and is therefore of interest to them.
The lib2 developer will need information about the calls from the application, the calls to component libraries (lib3 and lib4), and the calls to system-level services (MPI). They will have no interest in performance data for lib1. The lib1 developer will have no interest in data from lib2, lib3, and lib4.
In this section folding is illustrated by giving configurations that apply to the example above. The sample libraries.c program (available at https://software.intel.com/en-us/product-code-samples) reproduces the same pattern. Its call tree looks as follows (calls are aggregated and sorted by name, therefore the order is not sequential):
By using the configuration options listed below, different parties can run the same executable to get different traces:
Application developer: Trace the application only with the top-level calls in lib1, lib2, and lib4.
Configuration file: run_splibraries_app.conf
STATE lib*:* FOLD
lib2 developer: Trace only calls in lib2, including its top-level calls.
Configuration file: run_splibraries_lib2.conf
lib2 developer, detailed view: Trace the top-level calls to lib2 and all lib2, lib3, lib4, and system service calls invoked by them.
Configuration file: run_splibraries_lib2detail.conf
STATE Application:* FOLD
STATE lib2:* UNFOLD
Application and lib4 developers: Trace only the calls in lib4 that are made by the application.
Configuration file: run_splibraries_lib4.conf
STATE *:* FOLD
STATE lib4:* UNFOLD CALLER Application:*
These configurations assume that the application, library, and system calls are instrumented in such a way that their classes differ. Alternatively, you can match against a function name prefix that is shared by all calls in the same library.
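To get a feel for how the FOLD, UNFOLD, and CALLER keywords interact, the configurations above can be tried against a small model. The Python sketch below is not part of Intel® Trace Collector and does not reproduce its implementation. It assumes that the last matching rule wins, that a folded function is recorded while its callees are hidden, and that an unfolded function is recorded together with its callees. These propagation rules are inferred from the examples in this section. The function names (lib1_run, lib1_helper, and so on) and the call tree are hypothetical, and Python's fnmatch only approximates the pattern syntax used in the configuration files.

```python
from fnmatch import fnmatchcase

# Hypothetical call tree modeled on the figure description.
# Each node is (state, callees), where state is "class:function".
CALL_TREE = ("Application:main", [
    ("lib1:lib1_run", [("lib1:lib1_helper", [])]),
    ("lib2:lib2_run", [
        ("lib3:lib3_run", []),
        ("lib4:lib4_run", []),
        ("MPI:MPI_Barrier", []),
    ]),
    ("lib4:lib4_run", []),
])

# Each rule is (state pattern, action, caller pattern or None),
# mirroring the STATE lines of the configuration files above.
APP_RULES = [("lib*:*", "FOLD", None)]
LIB2_DETAIL_RULES = [("Application:*", "FOLD", None),
                     ("lib2:*", "UNFOLD", None)]
LIB4_RULES = [("*:*", "FOLD", None),
              ("lib4:*", "UNFOLD", "Application:*")]


def pick_action(state, caller, rules):
    """Return the action of the last rule matching state and caller, or None."""
    action = None
    for pattern, rule_action, caller_pattern in rules:
        if not fnmatchcase(state, pattern):
            continue
        if caller_pattern is not None:
            if caller is None or not fnmatchcase(caller, caller_pattern):
                continue
        action = rule_action  # assumption: later rules override earlier ones
    return action


def recorded_states(node, rules, caller=None, folded=False):
    """Walk the call tree and list the states this model would record."""
    state, callees = node
    action = pick_action(state, caller, rules)
    # Assumption: a call is hidden inside a folded region unless it unfolds.
    visible = (not folded) or action == "UNFOLD"
    if action == "FOLD":
        child_folded = True
    elif action == "UNFOLD":
        child_folded = False
    else:
        child_folded = folded  # no matching rule: inherit from the caller
    result = [state] if visible else []
    for callee in callees:
        result += recorded_states(callee, rules, caller=state,
                                  folded=child_folded)
    return result


if __name__ == "__main__":
    for label, rules in [("app", APP_RULES),
                         ("lib2 detail", LIB2_DETAIL_RULES),
                         ("lib4", LIB4_RULES)]:
        print(label, recorded_states(CALL_TREE, rules))
```

In this model, the application developer configuration keeps only the top-level library calls, the detailed lib2 configuration keeps lib2 and everything below it, and the lib4 configuration keeps only the lib4 call made directly by the application. Matching on a shared function name prefix instead of the class would correspond to a pattern such as `*:lib4_*` in this sketch.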