Threading analysis helps identify the cause of ineffective processor utilization and shows where your application is not parallel. One of the most common problems is threads waiting too long on synchronization objects (locks). Performance suffers when waits occur while cores are under-utilized.
Threading analysis combines and replaces the Concurrency and Locks and Waits analysis types available in previous versions of Intel® VTune™ Amplifier.
Threading analysis uses user-mode sampling and tracing collection. With this analysis, you can estimate the impact of each synchronization object on the application and see how long the application waited on each synchronization object or in blocking APIs, such as sleep and blocking I/O.
Intel® VTune™ Amplifier supports two groups of synchronization objects:
objects usually used for synchronization between threads, such as Mutex or Semaphore
objects associated with waits on I/O operations, such as Stream
For the most current information on available knobs (configuration options) for the Threading analysis, enter:
$ amplxe-cl -help collect threading
This example shows how to run the Threading analysis on a Linux* myApplication application:
$ amplxe-cl -collect threading -- /home/test/myApplication
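The collection step could also be wrapped in a small helper that passes the -result-dir option, so the result is written to a known directory. This is only a sketch: the VTUNE and RESULT_DIR variables, the directory name threading_result, and the collect_threading function are illustrative assumptions, not names defined by the product.

```shell
#!/bin/sh
# Sketch only: VTUNE, RESULT_DIR, and collect_threading are illustrative names.
VTUNE="${VTUNE:-amplxe-cl}"                    # command-line tool to invoke
RESULT_DIR="${RESULT_DIR:-threading_result}"   # assumed result directory name

# Run the Threading analysis on the given application (and its arguments)
# and store the result in $RESULT_DIR.
collect_threading() {
    "$VTUNE" -collect threading -result-dir "$RESULT_DIR" -- "$@"
}
```

For example, calling collect_threading ./myApplication runs the Threading analysis on myApplication and writes the result to threading_result in the current directory.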
When the data collection is complete, do one of the following to view the result:
Use the -report action to view the data from the command line.
Use the -report-output option with the -report action to write the report to a .txt or .csv file.
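The two viewing options might be scripted as follows. This is a minimal sketch under stated assumptions: the result directory threading_result, the output file name, and the helper names view_report and save_report are hypothetical. The summary report type and the -format and -csv-delimiter options are used here for illustration and should be checked against amplxe-cl -help report for your version.

```shell
#!/bin/sh
# Sketch only: all variable and function names below are illustrative.
VTUNE="${VTUNE:-amplxe-cl}"                          # command-line tool to invoke
RESULT_DIR="${RESULT_DIR:-threading_result}"         # assumed: replace with your result directory
REPORT_FILE="${REPORT_FILE:-threading_summary.csv}"  # assumed output file name

# Print the summary report for the result to the terminal.
view_report() {
    "$VTUNE" -report summary -result-dir "$RESULT_DIR"
}

# Write the same report to a comma-separated file instead of the terminal.
save_report() {
    "$VTUNE" -report summary -result-dir "$RESULT_DIR" \
        -format csv -csv-delimiter comma -report-output "$REPORT_FILE"
}
```

Calling view_report prints the report to the terminal, while save_report writes it to the file named in REPORT_FILE.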