The Importance of a Sequential Backbone

There seems to be a widespread notion that in order to do parallel programming, you have to write an inherently parallel program.  By inherently parallel, I mean one that must have more than one thread to run correctly.  A simple example is a producer-consumer program with a bounded buffer.  A single thread cannot execute such a program by running the producer to completion first (buffer might overrun), or running the consumer to completion first (nothing to consume!).

Thus instead of sequential vs. parallel, consider three categories:

    1. sequential programs

    1. sequential programs with compute-head parallelism

    1. inherently parallel programs

I'm a big fan of category 2, because it enables me to debug the single-threaded case first.  That lets me eliminate most of the logic errors without parallel pain.  Then I crank up the number of processors and remove the parallel bugs.  Programs in category 3 are inherently painful to debug from the start.  Furthermore, they have performance pitfalls when the number of required threads exceeds the number of hardware threads, because then time slicing artifacts can hit hard.

Packages like OpenMP, Intel® TBB, and Cilk are designed for category 2.   They do not guarantee any parallelism; they merely provide it when it is available.  The extra threads execute tasks ahead of time that would otherwise eventually be executed by the main thread.   Optional parallelism makes the program easier to understand and explain.  You can understand the program's sequential execution first, and then understand how the look-ahead computations don't upset it.

The distinction between Category 2 and category 3 needs to be widely understood.  Perhaps Category 2 needs a catchy name?

For more complete information about compiler optimizations, see our Optimization Notice.


anonymous's picture

Sequential-with-benefits requires that a valid sequential execution exists it does not require that it be used in practice. Therefore Amdahl's law is no more an issue for sequential-with-benefits than any other parallel program.

For example, "for(int i=0; i<N; ++i ) ++a[i]" has a valid sequential execution, yet it can be executed with N-way parallelism and no sequential portion on a PRAM.

anonymous's picture's_law

Arch D. Robison (Intel)'s picture

A point I want to clarify is that sequential-with-benefits does not mean automatically parallelized. I still expect the programmer to be writing an explicitly parallel program. The distinction here is that a sequential-with-benefits program has a legal sequential order for its tasks, whereas an inherently parallel program does not.

I think that inherently parallel programs would be an even worse nightmare for massively parallel machines. If a program inherently requires a minimum of 10,000 threads to run correctly at all, it would seem very difficult to understand and debug.

In contrast, a sequential-with-benefits (to use Clay's term) program can be debugged sequentially first, and then the number of threads cranked up, perhaps two or four threads at first to get the basic parallel bugs to show (and fix), and then cranked up to 10,000 threads to get the performance problems to show (and fix).

anonymous's picture

Didn't realize it will truncate my comments. FUll comments at

anonymous's picture

I agree with the taxonomy -- may be Category 2 should be called sequential-parallel programming or even sequentially programmed -- parallel via automation !

The Category 3 -- Inherently parallel programming, IMHO, should deal with massively parallel architecture -- for the current generation (i.e.

Clay B.'s picture

New name for Category 2, eh? How about "Sequential with benefits"?

Add a Comment

Have a technical question? Visit our forums. Have site or software product issues? Contact support.