Identifying implicit locks and their impact

Sometimes your application does not exhibit the parallelism you expected because of implicit locks. Implicit locks are locks acquired inside system calls, as opposed to the synchronization objects that you explicitly implemented to protect globally accessed data.

One way to find these implicit locks is to use Concurrency analysis. After collecting Concurrency data on your application, the results may look like this:

image1 (Small).JPG

This example shows a lot of serial time in ShowProgress() (note the relative size of the orange bar). If we turn off the “Assign system time to user calling function” feature via the highlighted button

image1-3.JPG

the Parallel Amplifier now shows that most of our non-parallel time is in KiFastSystemCallRet(). Reviewing the stack data in the Call Stack pane on the right, we see where execution left our code, ShowProgress(), and entered system code, namely printf(). (The system calls are shown in grey in the image below.) Thus, the non-parallel time is the result of calling printf(), which eventually calls KiFastSystemCallRet().

image2-2.JPG
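To make the pattern concrete, here is a minimal sketch of code that could produce a profile like this one. It is not the application from the screenshots: the body of ShowProgress() and the RunWorkers() helper are hypothetical, written only to show worker threads that never take an explicit lock yet still contend on the lock the C runtime holds around each printf() call.

```cpp
#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical stand-in for the article's ShowProgress(); the real body
// is not shown in the source.
void ShowProgress(int worker, int step) {
    // No explicit synchronization here, but the C runtime locks the
    // stdout stream for the duration of the call, so concurrent callers
    // take turns instead of running in parallel.
    std::printf("worker %d: step %d\n", worker, step);
}

// Hypothetical driver: starts `workers` threads that each report
// progress `steps` times, then returns the number of reports requested.
int RunWorkers(int workers, int steps) {
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([w, steps] {
            for (int s = 0; s < steps; ++s) {
                ShowProgress(w, s);  // every iteration passes through the stream lock
            }
        });
    }
    for (auto& t : pool) {
        t.join();
    }
    return workers * steps;
}
```

Calling RunWorkers(4, 1000) from main() would start four threads that spend most of their time waiting on one another inside printf(), which is the kind of wait the Call Stack pane attributes to system code.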

Double-clicking on ShowProgress() in the Call Stack pane will take us to the source code where the call to the system code was made:

image3-2.JPG

In this example, removing the call to printf() validates our conclusion (notice that ShowProgress() no longer appears in the list of functions with wait time):

image4-2.JPG

The concurrency summary also shows an increase in parallelism, from 1.07 to 1.54:

image1-4b.JPG

Of course, you can’t always remove calls to system functions, but this example verifies that the call to printf() serialized the code because of an implicit lock within the file I/O.
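When the output has to stay, one possible approach is to keep the call off the hot path. The sketch below assumes progress can be reported once, after the work is done: workers only increment an atomic counter, and a single thread calls printf(). The names g_done, RecordProgress(), and RunWorkersQuietly() are hypothetical. Whether this helps in a given application is something to confirm by rerunning Concurrency analysis.

```cpp
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical shared progress counter (not from the original article).
std::atomic<int> g_done{0};

// Workers record progress without touching stdio, so no stream lock is
// taken on this path. The atomic increment is still shared state, but
// it does not block the calling thread.
void RecordProgress() {
    g_done.fetch_add(1, std::memory_order_relaxed);
}

// Hypothetical driver: runs the workers, then reports once from the
// calling thread. Returns the number of steps recorded.
int RunWorkersQuietly(int workers, int steps) {
    g_done.store(0);
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([steps] {
            for (int s = 0; s < steps; ++s) {
                RecordProgress();
            }
        });
    }
    for (auto& t : pool) {
        t.join();  // join makes every worker's increments visible here
    }
    int total = g_done.load();
    std::printf("completed %d steps\n", total);  // one printf() call, one thread
    return total;
}
```

The trade-off is less frequent feedback. An application that needs live progress could instead have one dedicated thread read the counter periodically and print it.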