When the sample application exits, the Intel® VTune™ Amplifier finalizes the results and opens the Locks and Waits viewpoint, which is configured to display synchronization objects sorted by Wait time. To interpret the performance data for the sample code, do the following:
Click the Bottom-up tab to open the Bottom-up pane.
The table below explains the type of data provided in the Bottom-up pane:
In the nqueens_parallel sample code, two critical wait objects, OMP Critical nqueens_IP_setqueen and OMP Join Barrier, caused redundant synchronization and had the longest Wait time and the highest Wait count. The bar indicators in the Wait Time column show that processor cores were underutilized for most of the time spent waiting on these objects.
Analyze Source Code
Explore the source of the critical synchronization objects that caused significant Wait time and poor processor utilization. Double-click the nqueens_IP_setqueen object to analyze the source of the setqueen wait function. Click the button on the Source pane toolbar to go to the biggest hotspot code line in the function. VTune Amplifier highlights line 142 protected by the OpenMP* critical section.
The setqueen function waited for 140.687 seconds while this code line was executing. During this time, the operation was contended 39,988 times.
Hover over any transition line in the Timeline pane below to explore the infotip and make sure that all the transitions are caused by the OMP Critical nqueens_IP_setqueen critical section.
The OMP Critical nqueens_IP_setqueen section is where the application serializes: only one thread can be in the critical section at a time, so each thread has to wait for the section to become available before it can proceed.
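The serialization pattern described above can be sketched roughly as follows. This is not the sample's source: it is a minimal Python illustration in which a `threading.Lock` stands in for the OpenMP critical section, and the names `count_solutions_with_lock`, `set_queen`, and `place_first_queen` are hypothetical. It assumes the protected operation is an update of a shared solution counter, which the text above does not confirm.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def count_solutions_with_lock(n, workers=4):
    """Count n-queens solutions, updating one shared counter under a lock."""
    lock = threading.Lock()       # stands in for the OpenMP critical section
    state = {"solutions": 0}      # shared by all worker threads

    def set_queen(row, cols, diag1, diag2):
        # Hypothetical stand-in for the sample's setqueen routine.
        if row == n:
            # Every thread that finds a solution must pass through here.
            # Only one thread holds the lock at a time; the others wait.
            with lock:
                state["solutions"] += 1
            return
        for col in range(n):
            d1, d2 = row + col, row - col
            if col in cols or d1 in diag1 or d2 in diag2:
                continue  # square is attacked by an earlier queen
            set_queen(row + 1, cols | {col}, diag1 | {d1}, diag2 | {d2})

    def place_first_queen(col):
        # Each task fixes the queen in row 0 and searches the remaining rows.
        set_queen(1, {col}, {col}, {-col})

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(place_first_queen, range(n)))
    return state["solutions"]
```

Calling `count_solutions_with_lock(8)` returns the number of solutions for an 8x8 board. In this sketch the lock is entered once per solution found, so the more threads reach that line at the same moment, the more often one of them has to wait.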