User Guide

  • 2020.3
  • 07/10/2020

Graphics Trace Analyzer is an instrumentation-based tool. It obtains its data from gpa_trace files, which are generated by Intel® GPA while profiling an application that has been instrumented with ITT API calls.
To get the most out of the ITT API, you need to add API calls in your code to designate logical tasks. This will help you visualize the relationship between tasks in your code, including when they start and end, relative to other CPU and GPU tasks.
At the highest level, a task is a logical group of work executing on a specific thread, and may correspond to any grouping of code within your program that you consider important. You can mark up your code by identifying the beginning and end of each logical execution chunk.
To resolve the majority of performance bottlenecks, the following four API calls are usually sufficient:
  • __itt_domain_create()
    Creates a domain required in most ITT API calls. You need to define at least one domain.
  • __itt_string_handle_create()
    Creates string handles for identifying your tasks. String handles are more efficient for identifying traces than strings.
  • __itt_task_begin()
    Marks the beginning of a task.
  • __itt_task_end()
    Marks the end of a task.
The following sample shows how these four basic ITT API functions are used in a multithreaded application.
#include <windows.h>
#include <atomic>
#include <ittnotify.h>

// Forward declaration of the thread function.
DWORD WINAPI workerthread(LPVOID);

// Set by main() to tell the workers to stop. Atomic because several threads read it.
std::atomic<bool> g_done(false);

// Create a domain that is visible globally: we will use it in our example.
__itt_domain* domain = __itt_domain_create(__TEXT("Example.Domain.Global"));

// Create string handles that identify the "main" and "CreateThread" tasks.
__itt_string_handle* handle_main = __itt_string_handle_create(__TEXT("main"));
__itt_string_handle* handle_createthread = __itt_string_handle_create(__TEXT("CreateThread"));

int main(int, char*[])
{
    // Create a task associated with the "main" routine.
    __itt_task_begin(domain, __itt_null, __itt_null, handle_main);

    // Now we'll create 4 worker threads.
    for (int i = 0; i < 4; i++)
    {
        // We might be curious about the cost of CreateThread. We add tracing to do the measurement.
        __itt_task_begin(domain, __itt_null, __itt_null, handle_createthread);
        // Pass the thread index through the LPVOID argument (INT_PTR avoids a size mismatch on 64-bit builds).
        ::CreateThread(NULL, 0, workerthread, (LPVOID)(INT_PTR)i, 0, NULL);
        __itt_task_end(domain);
    }

    // Wait a while, then tell the workers to stop.
    ::Sleep(5000);
    g_done = true;

    // Mark the end of the main task.
    __itt_task_end(domain);
    return 0;
}

// Create a string handle for the "work" task.
__itt_string_handle* handle_work = __itt_string_handle_create(__TEXT("work"));

DWORD WINAPI workerthread(LPVOID data)
{
    // Set the name of this thread so it shows up in the UI as something meaningful.
    // TCHAR and __TEXT keep the buffer and format string consistent in ANSI and Unicode builds.
    TCHAR threadname[32];
    wsprintf(threadname, __TEXT("Worker Thread %d"), (int)(INT_PTR)data);
    __itt_thread_set_name(threadname);

    // Each worker thread does some number of "work" tasks.
    while (!g_done)
    {
        __itt_task_begin(domain, __itt_null, __itt_null, handle_work);
        ::Sleep(150);
        __itt_task_end(domain);
    }
    return 0;
}
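
In the sample above, every __itt_task_begin() call is paired by hand with an __itt_task_end() call on the same thread. In larger functions with early returns, that pairing is easy to break. The following is a minimal sketch of one way to keep the pair together using a C++ scope guard. It is not part of the ITT API: ScopedTask and do_work are hypothetical names introduced here for illustration, and the two callbacks only record events so that the sketch compiles without ittnotify.h. In real code, the callbacks would presumably wrap __itt_task_begin(domain, __itt_null, __itt_null, handle) and __itt_task_end(domain).

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard (not part of the ITT API): runs `begin` when it is
// constructed and `end` when it goes out of scope, so the two calls stay
// paired on the current thread even if the function returns early.
class ScopedTask {
public:
    ScopedTask(const std::function<void()>& begin, std::function<void()> end)
        : end_(std::move(end)) {
        assert(begin && end_);  // both callbacks must be callable
        begin();
    }
    ~ScopedTask() { end_(); }

    // Non-copyable: copying the guard would run `end` more than once.
    ScopedTask(const ScopedTask&) = delete;
    ScopedTask& operator=(const ScopedTask&) = delete;

private:
    std::function<void()> end_;
};

// Stand-in for an instrumented function. The callbacks append to `events`
// instead of emitting trace data; this is an assumption made so the sketch
// is self-contained.
void do_work(std::vector<std::string>& events, bool early_exit) {
    ScopedTask task(
        [&events] { events.push_back("begin"); },  // real code: __itt_task_begin(...)
        [&events] { events.push_back("end"); });   // real code: __itt_task_end(...)

    if (early_exit) {
        return;  // the guard's destructor still records "end"
    }
    events.push_back("work");
}
```

Calling do_work with early_exit set to true still records "end" after "begin", which is the property that matters when the callbacks are replaced by the real task markers.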
