Thread Safety Analysis


DreamWorks Animation seeks to thread complex rendering applications that were written before threading was commonplace.  This article shows a technique to find and fix thread safety issues by executing legacy code in a threaded test harness and monitoring execution with Intel developer tools.


Our engineering engagement with DreamWorks Animation involved introducing thread parallelism in performance-critical regions of various applications. DreamWorks Animation has developed hundreds of libraries. During parallelization, we discovered that some of these libraries were not thread safe, and those issues had to be addressed before we could continue our work. Here I describe a methodology, developed while working with DreamWorks Animation, for investigating thread safety issues using Intel® software development tools. I’ll discuss how to use the tools together effectively and highlight features that are useful when analyzing serial code to determine whether it can be safely executed in parallel. The tools described are available in Intel® Parallel Studio XE 2011 and include:

  • Intel® Inspector XE 2011 (formerly Intel® Thread Checker),

  • Threading diagnostics available in the Intel® C++ Compiler, and

  • Code coverage tool provided with Intel® C++ Compiler.


First, I used the threading diagnostics available with the Intel® C++ Compiler to identify accesses to global variables that need to be protected when the code is called from multiple threads. This has the benefit of analyzing all of the source code, independent of workload. Next, I used OpenMP* to execute existing serial code in parallel so that Intel® Inspector XE could identify thread safety issues in serial code that we want to run in parallel. I then used the code coverage tool provided with the Intel® C++ Compiler to show which parts of the source code were executed when running Intel® Inspector XE. However, the code coverage tool doesn’t indicate whether the code was executed in parallel; that must be determined by code inspection. I used the tools both to identify threading errors and to verify that source code changes resolved them. What follows is how I used these tools for thread safety analysis.

Intel® Inspector XE 2011


Intel® Inspector XE 2011 (formerly Intel® Thread Checker) detects memory and threading defects, and is the main tool used to test thread safety. Intel® Inspector XE finds threading errors by instrumenting code that is executed in parallel, and therefore it cannot detect threading errors in code not exercised by the workload or test. Figure 1 shows the results on a sample program (see Figure 5 for sample code) that illustrates some of the thread safety issues encountered when making legacy serial libraries thread safe. Intel® Inspector XE shows a read/write race condition on access to the C++ member data curState. The two source windows on the left show the function set_state being executed by two threads. On the right side of the figure, the call stack shows how the thread safety issue was reached. This is very useful when working on large applications, because it helps you understand which parts of the application have thread safety issues. In this simple example the call stack is obvious. It is important to note that Intel® Inspector XE instruments the code to find threading errors, which increases the run time of the application, so it is best to use a small workload that still exercises the code you want to test.



Figure 1 - Screenshot of Intel® Inspector XE showing a data read/write race error.
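
The kind of race reported in Figure 1 can be illustrated with a minimal sketch. This is not the article's sample program: only the names curState and set_state come from the figure, while the State class, the update logic, set_state_unsafe, highest_state, and the choice of an OpenMP critical section as the fix are assumptions made for illustration.

```cpp
// Hypothetical stand-in for a legacy class. Only the names curState and
// set_state are taken from Figure 1; everything else is assumed.
class State {
public:
    int get_state() const { return curState; }

    // Unsafe variant: reads and then writes curState with no
    // synchronization. Two threads calling this on the same object
    // at the same time form a read/write data race.
    void set_state_unsafe(int s) {
        if (s > curState) curState = s;
    }

    // One possible fix: guard the read and the write with a named
    // OpenMP critical section. Without the OpenMP compiler option the
    // pragma is ignored and the function behaves as plain serial code.
    void set_state(int s) {
        #pragma omp critical(state_update)
        {
            if (s > curState) curState = s;
        }
    }

private:
    int curState = 0;
};

// Calls set_state(0..n-1) on one shared object from a parallel loop and
// returns the final state, which should be n-1 for n > 0.
int highest_state(int n) {
    State shared;
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        shared.set_state(i);
    }
    return shared.get_state();
}
```

Substituting set_state_unsafe for set_state in highest_state gives the kind of unsynchronized access that a threading analysis is meant to catch. A named critical section serializes updates across all State objects, so a per-object lock may scale better in real code.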

Using OpenMP* to Test Thread Safety of Legacy Serial Code


A quick and effective way to execute legacy code in parallel, so that Intel® Inspector XE can detect threading errors, is to use appropriate OpenMP* constructs. An example is shown in Figure 2, where the OpenMP pragma “omp parallel” creates a parallel region in which the function LegacyLibTest is called by multiple threads; the resulting run can then be analyzed with Intel® Inspector XE. When you build with the OpenMP compiler option (-openmp for the Intel® Compilers), the OpenMP run-time takes care of the mechanics of creating threads and executing the function in each of them. A major advantage of OpenMP is that you can recover the serial semantics of the program either by removing the OpenMP pragma or by building without the OpenMP compiler option, in which case the pragma is ignored. OpenMP is widely supported by recent C and C++ compilers.


void LegacyLibTest()
{
  TestFunction1(); // function in legacy library
  TestFunction2(); // function in legacy library
  // …
}

int main()
{
  // The region below is executed by multiple threads.
  #pragma omp parallel
  {
    LegacyLibTest();
  }
}


Figure 2 - Using OpenMP parallel region to parallelize legacy serial code.
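
A slightly fuller version of this harness might look like the sketch below. The bodies of TestFunction1 and TestFunction2, the int arguments and return values, and the helpers RunHarness and HarnessThreadCount are all hypothetical; real legacy functions would be linked in from the library under test. The sketch compares each thread's result against a serial reference, and it still builds and runs serially when the OpenMP option is omitted.

```cpp
#ifdef _OPENMP
#include <omp.h>   // only needed when building with the OpenMP option
#endif

// Hypothetical stand-ins for functions in a legacy library.
static int TestFunction1(int x) { return x * 2; }
static int TestFunction2(int x) { return x + 1; }

// Serial test routine, as in Figure 2, but returning a value to check.
int LegacyLibTest(int input) {
    return TestFunction2(TestFunction1(input));
}

// Number of threads the parallel region is expected to use; 1 when the
// program is built without OpenMP.
int HarnessThreadCount() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Runs LegacyLibTest once per thread inside a parallel region and
// returns how many threads produced a result different from the
// serial reference computed up front.
int RunHarness(int input) {
    const int expected = LegacyLibTest(input);  // serial reference
    int mismatches = 0;
    #pragma omp parallel reduction(+:mismatches)
    {
        if (LegacyLibTest(input) != expected) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

A zero return from RunHarness does not show that the library is thread safe; it only provides a parallel execution for a threading analysis tool to observe.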

 


Intel® C++ Compiler Threading Diagnostics


The Intel® C++ Compiler has compile-time diagnostics that flag references to, assignments to, and address-taking of statically allocated variables. These are useful for finding uses of global variables in your legacy code that will need to be protected against concurrent access when executing in parallel. The diagnostics are enabled with a compiler option: icpc -diag-enable thread. Figure 3 shows the threading diagnostic output for the sample code shown in Figure 5. For this simple example, none of the reported diagnostics is an actual threading error. However, these diagnostics were useful for identifying global variables at DreamWorks Animation, and it can be worthwhile to address what they report before doing an analysis with Intel® Inspector XE. Figure 4 shows how the diagnostics can be disabled for code that generates unwanted diagnostics.


$ icpc -c -openmp state.cc -diag-enable thread
state.cc(43): warning #1710: reference to statically allocated variable "MAX"
    float W[MAX], result = -1.;
            ^
state.cc(44): warning #1710: reference to statically allocated variable "MAX"
    for (int i=0;i<MAX;i++) W[i] = (float)i;
                   ^
state.cc(47): warning #1710: reference to statically allocated variable "MAX"
    for (int i=0; i<MAX;i++) {
                    ^
state.cc(52): warning #1710: reference to statically allocated variable "MAX"
      result = doWork(W, MAX);
                         ^
state.cc(53): warning #1712: address taken of statically allocated variable "std::cout"
      std::cout << "result = " << result << std::endl;
      ^


Figure 3.  Intel® Compiler threading diagnostics output on sample code.


__pragma(warning(disable:1710))
// code that generates unwanted diagnostics
__pragma(warning(default:1710))


Figure 4.  Example of how to disable Intel® Compiler threading diagnostics.
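
To make the distinction between benign and problematic reports concrete, the sketch below contrasts two statically allocated variables. The names MAX and doWork appear in the Figure 3 output, but their definitions here, along with callCount, runOnce, and callsAfterParallelRuns, are assumptions for illustration and are not taken from state.cc.

```cpp
// Read-only after initialization: concurrent reads are safe, so a
// report about referencing this variable is benign.
static const int MAX = 1000;

// Mutable global state: the kind of variable that needs protection
// once the code runs in parallel.
static int callCount = 0;

// Hypothetical body for doWork: sums n values and counts the call.
float doWork(const float* w, int n) {
    // A bare ++callCount here would be an unsynchronized write when
    // called from several threads. One possible fix is an atomic update.
    #pragma omp atomic
    ++callCount;

    float sum = 0.0f;
    for (int i = 0; i < n; ++i) sum += w[i];
    return sum;
}

// Fills a thread-private array and calls doWork on it.
float runOnce() {
    float W[MAX];
    for (int i = 0; i < MAX; ++i) W[i] = (float)i;
    return doWork(W, MAX);
}

// Calls runOnce n times from a parallel loop and returns the number of
// calls recorded, which should equal n if the counter is protected.
int callsAfterParallelRuns(int n) {
    callCount = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        (void)runOnce();
    }
    return callCount;
}
```

Triage of the diagnostics then comes down to asking, for each reported variable, whether anything can write to it while other threads are running.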

Intel® C++ Compiler Code Coverage Tool    


The Intel® C++ Compiler provides a code coverage tool, codecov, that uses the compiler’s profile-guided optimization technology to generate HTML pages of your source code annotated with coverage information, along with a summary report at the file, function, and basic block levels. A basic block is a block of assembly code generated by the compiler that has a single entry and a single exit; reporting at this level lets the code coverage tool provide finer detail than a line of source code. This gives additional detail on which parts of your code were covered by the workload(s) you ran. By running the code coverage tool with the same workload(s) used with Intel® Inspector XE, you can determine which parts of your code Intel® Inspector XE analyzed. However, the code coverage tool doesn’t show whether the code was executed in parallel, which needs to be determined by code inspection.

The code coverage output for the sample program is shown in Figure 5, where code is highlighted in different colors: covered code (grey), uncovered basic blocks (yellow), and uncovered functions (pink). The colors can be customized via command line options.


Figure 5 – Intel® Compiler code coverage tool output.

 


Summary


I discussed the methodology developed while working with DreamWorks Animation to investigate thread safety issues using Intel® software development tools, and how the tools can be used together to increase their effectiveness. It’s important to note that using these tools will not guarantee that your application is free of threading errors: the tools may not exercise all of your code, the tools themselves have limitations, and unforeseen problems may arise when using them on today’s large and complex applications. Still, it is my hope that these techniques will be useful in improving the thread safety of your code.