Three Things You Need to Teach About Parallel Programming

By Clay Breshears (Intel) (133 posts) on June 19, 2008 at 9:35 am

I was presented with a challenge the other day.  The question was, "What three things do you think must be taught about parallelism in universities, such that, by not teaching these three things professors would be doing a disservice to their students?" A further restriction put on this question was that each topic could only take half an hour or less to cover. Below is a quick description of the lectures that I came up with, in the order of presentation.

Part 1 - Recognizing Parallelism. This lecture points out why parallelism is important in today's software development environments and how to recognize 2 types of opportunities for parallelism (task and data decomposition) within application code. Examples could start with real-world examples of processes that are "executed" in parallel and then provide a programming example.

Part 2 - Shared Memory and Threads. This lecture give more ideas about how the models covered in the previous part are found within serial code, how threads would be used to express that parallelism, and the major concerns programmers need to be aware of when working in shared memory with threads (i.e., data races).

Part 3 - OpenMP Programming.This lecture gives a quick introduction on how to parallelize loops with independent iterations using OpenMP and how OpenMP deals with some of the concerns expressed in the previous lecture. This is intended to give students a chance to actually do some simple parallel programming and see the performance benefits of parallel execution.  

Of course there are many more than three things that I would want to put into a university curriculum about parallel programming. If nothing else is ever taught on the topic in a student's academic career, this minimal set prepares for their own explorations of alternate threading models (e.g., Windows Threads, Pthreads, Intel(R) TBB) or alternates to shared-memory parallel programming (e.g., message-passing, functional languages).

Proposals for adding parallel programming to a Computer Science curriculum frequently gets pushback asking "What do we eliminate from our already over-stuffed curriculum in order to add parallel programming?" The half hour limit is intended to address this concern. I see these three lectures dropping into a first programming class (CS1) that used a language supporting OpenMP, right after loops and loop constructs were discussed. Ninety minutes out of a semester of contact (60 minutes, 3 times a week, for 15 weeks = 2700 minutes) in such a course could be set aside for these topics.

To paraphrase Dennis Miller, "...of course, this is just my opinion. I could be wrong." Does this list cover the minimum set of parallel programming knowledge? Could it be done with only two topics or is there a fourth (or fifth) vital element that I've overlooked? Is shared-memory programming on multi-core processors the way to introduce/practice these topics or would this just scar future programmers? If you were to answer the original question at the top of this post, how would you respond?

Categories: Academic, Parallel Programming

Comments (10)

June 19, 2008 1:47 PM PDT

Dan Ernst
Dan ErnstTotal Points:
25
Registered User
Going back to the original question - ("What three things do you think must be taught about parallelism in universities, such that, by not teaching these three things professors would be doing a disservice to their students?") - I think your list focuses a bit much on technologies and not enough on parallelism.

I do think you hit the first point on the head - show them decompositions and give examples. The second point I would change to something more generic, like "Exploiting parallelism". Give them a hands-on with whatever model you have handy, threads/shared mem with C++/Java, message passing/libraries with Erlang, even MapReduce with Scheme. I think the key point here is to integrate it into whatever programming environment you're already using - making it just an extension to their current environment.

I would consider a third point about generic communication between parallel tasks - something you lump into point 2. (I'd also call you a miracle worker if you can cover a language thread model AND race conditions in 30 minutes in CS1!)

Hooking our students to a particular technology has positives AND negatives - it may help them get a job on that technology straight out of college - but without a grounding in a more generic background, it may limit their ability to adapt as the model changes (and it will be changing!)

It's even very possible, given the rate at which the field is moving towards manycore, that the shared mem/threaded model may move out of favor by the time that current CS freshman even complete their degrees! (See the comments of Berkeley's Edward Lee, among others...)

I do definitely agree that the foundational material for parallelism/concurrency doesn't really take much class time.
June 20, 2008 9:17 AM PDT


Michael Wrinn
A "3 things" approach is nicely compact, and this particular list offers concrete examples (ISC has content - presentations and labs -- on all of them, available to any member here wishing to use it). I'd reduce it further: The *Two Things* you need to teach about concurrency: concept -- how to think of it, and implementation -- what to do about it. In this reduction, Clay's first point addresses part of the concept, his second and third, implementations.

Approaching the matter with these twin requirements -- concept and implementation -- leaves room for courseware to evolve, as the models become more crisp, and the implementation techniques improve. Dan is on target in warning against too great a dependency on the method du jour (in industry, the current default is shared-memory/shared-state threading, but, again as noted by Dan, this may eventually be bypassed by more reliable approaches).
June 23, 2008 11:44 AM PDT


Charles E. Leiserson
In my opinion, the most important thing to teach students is what parallelism is and how to quantify it. See my blog "What the $#@! is Parallelism, Anyhow?" at http://www.cilk.com/multicore-blog/bid/5365/What-the-is-Parallelism-Anyhow.
June 23, 2008 5:43 PM PDT


Patrick Madden
I'd strongly advocate having Amdahl's Law in the mix, right at the start. All of us who know something about the field are aware that for many applications, we won't get anywhere near a factor of N speedup with N processors. Most of the current examples that are bandied about (ray tracing, high volume web servers, etc) are embarrassingly parallel anomalies.

If we're not candid about the prospects for performance scaling, we risk alienating many programmers. Suppose you have a enthusiastic programmer; he or she bites the bullet, sweats hard to learn a new language or API, recodes an entire application from the ground up in a very careful manner, and then finds... 1.5x speedup with 4 cores. And then perhaps 1.6x on 8 cores. Wouldn't this person feel like they were misled, lied to, sold a bill of goods?

Prof. Leiserson's blog post on parallelism (the span and work ... laws) hits the mark. There's a realistic sequencing of tasks, some opportunities to use multiple processors, but a 2X speedup for the example, no matter how many processors you throw at it.

Complexity theory should also be covered early. It's easy to speed up a bad algorithm, but frequently hard to speed up a good one; students should know not to swap a serial O(n log n) algorithm for a parallel O(n^2) algorithm.

One might note that parallel programming (called multiprocessing and multiprogramming back in the day) was part of the ACM Computing Curriculum starting in 1968: http://portal.acm.org/citation.cfm?id=362929.362976&coll.....N=72793383
Parallel courses have been taught at most Universities since the dawn of computing; I took three of them as an undergrad in the '80s. I don't expect any major sea change to happen by putting more emphasis at the University level.
June 23, 2008 11:49 PM PDT


Tony Targonski
Unfortunately it seems that many University students are having enough trouble with their loop constructs as it is (while others could be teaching those introductory classes). As interesting as this might be, I don't think parallel programming will work in CS1. Though after the bottom 40% (or whatever the appropriate figure is) quits pursuing their Computer Science degree after the 2nd year, all is fair game.
June 30, 2008 8:41 AM PDT

Clay Breshears (Intel)
Clay Breshears (Intel)Total Points:
16,177
Status Points:
16,177
Black Belt
The original challenge had the implication that these 3 Things would be the *only* thing I could teach about parallelism and concurrent programming during the entire undergraduate curriculum. Also, trying to look at this as being something one could implement at any level of university or college--from MIT to regional universities to Podunk Community College & Tractor Repair School (Go Fightin' Truck Farmers!). Thus, my heavier emphasis on implementation, with the reservation that this implementation must fit into whatever language is being used. (I chose OpenMP since I feel it is easier to teach the basic syntax in OpenMP than it is for the corresponding features of a native threading method or a message-passing library.)

Michael has split it well into two categories (concepts and implementation). However, if we don't teach the "method du jour," we would only be teaching theory and assumptions of what might be the next thing. With the restriction of a limited amount of contact time, we need to give students some kind of usable information or marketable skill. Teaching some yet-to-be-decided phantom shared-none method is a non-starter, no matter the advantages of such a programming method over a shared-any method. When the dust has settled and we know what the next method du jour is, then we adjust the lectures to use that method and its ramifications.

While academics have known about concurrency and parallelism for several decades and the ACM has included such content in it's model curricula, a blog from the recent IPDPS (http://software.intel.com/en-us/blogs/2008/05/02/parallel-co.....-curricula) would indicate that such topics are actually disappearing from CS curriculums. I remember considering concurrency in an Operating Systems course back in my bachelor degree days, but that was the only thing we had in the undergraduate curriculum. There was one graduate course I was able to take in my master's program. It piqued my interest and I followed through with more research and projects on my own. At Tennessee there was much heavier emphasis on parallel computation and more opportunities, of course, but I don't think there was anything required at the undergraduate level. I did teach the Parallel Algorithms course at Southern Miss as part of their HPC doctoral program. So there are pockets of parallelism out there, mostly at the graduate levels, though.

Since multi-core processors are now ubiquitous the instruction of how to program these processors must also be ubiquitous within CS curriculums. It's disheaterning to hear about students struggling with loops, but I don't think we can afford to put parallelism and concurrent programming on hold for two years until we're left with the "serious" students. Prof. Ernst's paper (Concurrent CS: Preparing Students for a Multicore World (http://www.cs.uwec.edu/~ernstdj/Papers/ITICSE08.pdf)) would argue that concurrent programming can be introduced into CS1. I expect that Dan takes more than 90 minutes in the course of the semester to cover the programming methods, though.

Much like writing haiku or a 100-word short story, restricting yourself to teach only three things that must then influence a student to acknowledge/understand/embrace the rudiments of parallel and concurrent programming for the next four years of their academic life is a good mental exercise. It should also illustrate that there is something that can be done at any university or college that will not disrupt the current curriculum nor will it require the sacrifice of major parts of established courses. I would hope that more than these three things would be allowed a place in the undergraduate degree program. Even if it only gets curriculum committees to dust off the 1968 ACM curriculum outlines and begin to reinstate those parallel programming courses or even to start seriously thinking about how to weave parallelism into courses, starting at the freshmen level, this mental exercise will have served its purpose.
August 26, 2008 10:38 PM PDT


John Phillips
Patrick -

I've come to the conclusion that although Amdahl's Law is obviously important it gets overemphasized or misinterpreted frequently. The problem I have with it (and I think Herb Sutter has made a similar point, to try not to steal from anyone) is that it ignores the context of the work being done.

Few companies with a working and selling product are going to parallelize it just to try and make it go faster. (By the way, my entry into this field was via scientific simulation, where we parallelize just to make it go faster.) That is development money spent with little reason to think there will be a practical return. They will parallelize to add to it. This means that Amdahl is probably not the right approach, and something like Gustaffson's law is better. How much extra work could be done in the same time by parallelizing? I think this is a better starting point for a new student. (However, in a reply on a different blog I say that I don't think the concepts are clearly enough understood to be sure how to teach a new student.)

If I'm correct, then this should be Clay's first point. What is parallelism and how can it help you do things you couldn't do before? This approach gets students thinking of dependencies between potentially separate operations as a bad thing that should be avoided when possible and contained when they can't be avoided. Not only is this good for future parallel work, but it is usually good design in any code.

It naturally motivates his first point as a follow up. If task or data separations are good, how can I recognize them in an algorithm? More importantly, how can I recognize them in a problem. This addresses the point that a good sequential algorithm could be a good, indifferent, or really bad parallel algorithm. This is the time to introduce Amdahl, since it gives a rule for understanding what makes a good sequential algorithm a bad parallel one.

I would then move to coping mechanisms for coupled processes. It could introduce the ideas of transactions, atomics, locking and message passing so students have some idea of each.

Each section should be accompanied by a few examples that provide templates for good implementations of simple problems and are available for the students outside class, and a few small projects where the students do something similar to (but not identical to) the examples to see what the range of permutations is for those methods. If the CS1 class uses C/C++, then the first examples could be introduced just after loops and use OpenMP. Simple thread and MPI examples could follow when appropriate. (I would tend to use C++ and boost for these, but others can do other things.) Other bases for the language used in the class would change the tools used, but with some effort useful tools can be found for many languages. (Probably not all, but enough.)

John Phillips
August 27, 2008 10:34 AM PDT

Clay Breshears (Intel)
Clay Breshears (Intel)Total Points:
16,177
Status Points:
16,177
Black Belt
John -

Excellent idea for motivation. Being from Intel, I can often get into the mindset that parallelism must be studied because that's what we've got now. A better approach is to first show off the benefits of parallelism. (As an analogy, showing a few real-world examples on applying groups, fields, and rings would motivate me to study abstract algebra better than just jumping into Abelian groups or fields of functions.)

I try to use Amdahl's Law for only one thing: making an initial estimate of potential speedup. Once the relationship between serial execution time and speedup is understood, and the benefit of parallel execution to be able to tackle bigger problems than might efficiently be run on serial machines, Gustaffson's Law is introduced.

Your point about using Amdahl to demonstrate whether or not a given serial algorithm would make a good parallel algorithm is a logical usage, and I'm embarassed that I'd not thought of that application before. I'll have to try that when I can.
September 18, 2008 1:37 PM PDT

Paul Steinberg (Intel)
Paul Steinberg (Intel)Total Points:
7,225
Status Points:
7,225
Community Manager
OK Clay -Let's put you and your Three Important Things to the test! We'll make a series of three videos of you teaching this material. We can also provide sample code and knowledge checks as well. We'll put the entire package on the Wiki for instructors to use in building their curriculum. They can use the posted components in their classes or they could just use the videos as reference tools when constructing their lectures.

Actually - full disclosure - Clay and I have been discussing this for a while. We’ve also discussed it some with academic colleagues, most of whom have been very interested and supportive of the idea. Before we proceed further, we’d like to open this up for a more general discussion and get as much input as possible. You can start by going to the Academic Community Wiki and taking a look at the base content for our videos (http://softwarecollege.intel.com/wiki/Main_Page#Online_Video_Content). Let’s make this a collaborative effort.
If this proves a worthwhile experiment, we can move to producing video content in other course domains as well.

I’m including a portion of an email from Dr. Ana Balevic at the University of Stuttgart (with her kind permission).
. . . Making videos of your computer architects giving the courses is a brilliant idea. There are so many materials on the internet that it's hard to go through it in a limited time. Selected and focused videos with examples, would be a great choice for reducing time to master both the topic and how to present it to students. For example, NVidia has already prepared something like that for their CUDA (around 10x 1h lectures), and I find it very helpful for learning GPGPU programming. I would be very happy to "test" the videos - please send me the links when they are available, and I'll go through them as soon as I have some free time. One special plea that I have is to make all these contents downloadable – so that I can watch videos even if I do not have high-speed internet connection. The proposal for the first contents to be filmed sounds good, and it would be worth extending it with some more advanced programming examples . . .

Thank you Ana for this input. I am looking forward to hearing from others as well.
March 13, 2009 5:54 PM PDT


Jennifer Levine
A video lecture series was created to cover these three items in 30 min! Great stuff. http://software.intel.com/en-us/blogs/2008/12/19/video-lectu.....ust-teach/


Trackbacks (1)


Leave a comment  

To obtain technical support, please go to Software Support.
Name (required)*

Email (required; will not be displayed on this page)*

Your URL (optional)


Comment*