Testing Parallel Programs

By Cesar Martinez (Intel)

A constant in computer science is that the world’s hunger for faster performance is never satisfied. Today, these performance demands are not only for speed but also for smaller and more powerful mobile devices. In response to rising user expectations, the market research firm IDC anticipates that OEMs will select faster multi-core processors to drive their devices [1]. Within this context, parallel programs will be widely used in mobile equipment to make the most of multi-core technology.

Testing is one of the most expensive phases of the software development cycle: large software vendors spend 50% of their development cost on testing [2], and for parallel software this cost is even higher. Such a budgetary burden makes the selection of an efficient and appropriate testing methodology essential to any software development strategy.

In previous articles [3], we discussed tips for porting an application from serial to parallel programming and how to prepare for parallel optimization. This article provides an introduction to the different methodologies for testing parallel software.

Typically, a parallel program can be tested with three high-level methods:

Black-box: this method tests the functionality of an application without specific knowledge of the parallel application’s code. A limitation of black-box testing is that we do not know how much of the code has been covered. To overcome this drawback, test data adequacy criteria can be used to plan what should be tested. On the other hand, an advantage of this method is that it can be easily automated, and it involves less effort than white-box testing because it does not use information about the program structure.

White-box: this method tests the parallel source code and the algorithms used to implement the software functionality. In white-box testing, an internal perspective of the system, its data structures, and parallel programming skills are required to design test cases. The tester should plan which inputs to use to exercise paths through the code, and determine the appropriate outputs and, therefore, the correct behavior of the functionality. This method requires not only effort but also skilled resources to execute.

While white-box testing can be applied at different levels of the software testing process, it is usually done at the unit level. It can test paths within a unit, paths between units during integration, and paths between subsystems during a system-level test. Though this method can uncover many errors or problems, it might not detect unimplemented parts of the specification or missing requirements.

Mixed: this method combines the black-box and white-box methods. It is commonly used when some parts of the parallel algorithms are implemented with proprietary libraries, or for strategic testing purposes.

Testing Strategies

There are four main strategies to test a parallel program:

Stress testing: refers to tests with a greater emphasis on robustness, availability, and error handling under a heavy workload, rather than on what would be considered correct behavior under normal circumstances. In particular, the goals in parallel testing are to identify rare interleavings and to ensure that the software does not crash when computational resources are insufficient.

Systematic testing: refers to a complete conformance testing approach to software testing, in which the tested unit is shown to conform exhaustively to a specification, up to the testing assumptions [4]. One way to implement it is with a model checker, a tool that systematically explores the different thread interleavings for a given input.

Randomized testing: the goal of this testing strategy is to randomly pick interleavings in which to test the program’s behavior. A concept to highlight is that of active testing [5]. Active testing works in two phases: first, it uses predictive off-the-shelf static or dynamic program analyses to identify potential concurrency bugs, such as data races, deadlocks, and atomicity violations. In the second phase, active testing uses the reports from these predictive analyses to explicitly control the underlying scheduler of the concurrent program to accurately and quickly discover real concurrency bugs, if any, with very high probability and little overhead.

Heuristic-driven testing: with this strategy the tester carefully plans schedules that are suspected to be buggy. It generally requires understanding the program, mainly from a profiling and analysis perspective.

Planning and Measuring

The first step in parallel application testing is to set performance expectations appropriately by defining acceptable performance thresholds. These thresholds should be defined during project planning and will serve as the product’s non-functional requirements to be accomplished.

While defining these thresholds, the architect should determine the metrics to be gathered and tested during the testing life cycle. There are several metrics for measuring application performance: wall-clock time for a single job (also called turnaround time), wall-clock time for multiple jobs (throughput), MFLOPS (millions of floating-point operations per second), memory usage, I/O utilization, MIPS (millions of instructions per second), network usage, and others. Based on the characteristics of the application, it is important to decide which metric to use.

Moreover, an important point in test planning is the definition of the datasets that will be used in test case execution. Datasets and the execution environment should be carefully selected and should adequately represent the real use of the application. Benchmark measurements should be performed with the same datasets and environment setup. It is also suggested to change only one variable (for example, a compiler flag or a system setting) from one experiment to the next; this makes it easy to correlate a change in performance with the change that caused it. Wherever possible, data obtained with different tools or techniques should be cross-checked.

In conclusion, testing is one of the most expensive phases of software development. Therefore, the selection of an efficient and effective strategy is important. The methods and techniques introduced will help to define the most appropriate testing approach for your specific project.

For more information about parallel programming and tools for measuring concurrency of your applications, please visit /parallel

[1] Worldwide Mobile Phone 2011–2015 Forecast Update: September 2011. Ramon T. Llamas, William Stofega.

[2] Parallel Execution of Test Runs for Database Application Systems. Florian Haftmann, Donald Kossmann, Eric Lo.


[3] Preparing for Parallel Optimization. Diana Byrne, July 22, 2011.
/en-us/articles/preparing-for-parallel-optimization

[4] A Theory of Regression Testing for Behaviourally Compatible Object Types. J. H. Simons. Software Testing, Verification and Reliability, UKTest 2005 Special Issue, September, eds. M. Woodward, P. McMinn, M. Holcombe, and R. Hierons (Chichester: John Wiley, 2006), 133–156.


[5] CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs. Pallavi Joshi, Mayur Naik, Chang-Seo Park, and Koushik Sen.


For more complete information about compiler optimizations, see our Optimization Notice.