When we talk to university and college professors about adding parallel programming to their curriculum as early as possible, the typical response is agreement that this should be done. However, we are invariably challenged to tell the professors what can be taken out of their current, already jam-packed curriculum to make room for parallelism. Rather than wrestle over whether non-deterministic finite state automata have more intrinsic merit than radix sort or red-black trees, I have a revolutionary and simple solution to render the whole question moot: make all programming languages execute in parallel.
If any application from any programming language, without the benefit of any additional syntax or functionality, executed in parallel, there would be no need to teach parallel programming. Every program written would already be a parallel program. When executed, each core on the system would get a copy of the executable to run, and all global and static variables would be shared among all instances of the execution. I guess you might call this a Same Program, Single Data (SPSD) model.
"What about those times when there will be code that must only be executed by a single core?" you may ask. Glad you brought that up, because that is the best part of this idea. There would need to be something in all programming languages that would be able to designate regions of code in which it was critical for only one core to execute in serial (a région critique, if you will). And since we already know how to work with and teach serial programming, this feature becomes a "gimme" or a "mulligan" for current curriculums.
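For the curious, here is a minimal sketch of how the SPSD idea might be simulated today, using Python threads to stand in for cores. Everything named here (`NUM_CORES`, `program`, `run_spsd`, `counter_lock`) is made up for illustration, and the sketch assumes the "critical region" means mutual exclusion, that is, any number of instances may reach the region but only one may be inside it at a time.

```python
import threading

NUM_CORES = 4                    # hypothetical core count, not from the post
counter = 0                      # a "global" shared by every running instance
counter_lock = threading.Lock()  # stands in for the designated critical region


def program():
    """The whole 'program'; every simulated core runs this same function."""
    global counter
    for _ in range(10_000):
        # Critical region: only one instance at a time may execute this update.
        with counter_lock:
            counter += 1


def run_spsd(num_cores=NUM_CORES):
    """Start one copy of program() per simulated core and wait for all of them."""
    global counter
    counter = 0
    workers = [threading.Thread(target=program) for _ in range(num_cores)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter
```

Calling `run_spsd()` returns the final value of the shared counter. Without the lock, the unprotected read-modify-write on `counter` could lose updates, which is exactly the kind of thing the serial region is meant to prevent.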
Think of it like taking the stairs all the time. You've been trained in how to do this, you know how to train others, and it even comes naturally to some degree. Then, suddenly, all the staircases are turned into escalators. You can take the free ride without any extra effort, but you still have the option to walk the steps of the escalator without any retraining required.
I suppose, for the sake of performance, we might also need some mechanism included to divide iterations of loops into chunks that would then be run in serial on individual cores rather than each core running all iterations in redundancy. Might also want something to do the same with blocks of code. These are all details that I leave to the programming language designers.
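As a rough illustration of that loop-dividing mechanism, the sketch below splits the iterations of a loop into contiguous chunks and hands each chunk to one worker, which then runs its share serially. The helper names (`chunk_bounds`, `sum_of_squares`, `parallel_sum_of_squares`) are hypothetical, and the even, contiguous split is only one of several possible chunking policies. Note that in standard CPython the threads will not actually speed up this CPU-bound loop; the point here is the division of work, not the timing.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous, near-equal chunks."""
    base, extra = divmod(n_items, n_workers)
    bounds = []
    start = 0
    for w in range(n_workers):
        # The first `extra` workers each take one leftover iteration.
        size = base + (1 if w < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def sum_of_squares(start, stop):
    """The serial loop each worker runs over its own chunk of iterations."""
    total = 0
    for i in range(start, stop):
        total += i * i
    return total


def parallel_sum_of_squares(n_items, n_workers=4):
    """Run the loop once overall, with each worker covering a single chunk."""
    bounds = chunk_bounds(n_items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(lambda b: sum_of_squares(*b), bounds)
        return sum(partials)
```

Each worker keeps a private `total` and the partial results are combined only at the end, so no critical region is needed inside the loop itself.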
For now, I am atingle about the idea of not needing to teach parallel programming and instead being able to limit the special coding cases to organizing serial programming constructs, which we already know how to do. How about you?
(To confess, this idea of only parallel execution came from an article about the Fortress programming language being developed by Sun. However, I thought it was such a good idea that we could adopt it across the board, not just within a single language, and solve all those pesky issues about how to teach parallel programming.)
| August 21, 2008 10:54 AM PDT
Charles E. Leiserson | I agree that teaching programming in parallel as a default is a good vision to aspire to, but I don't see how a simple relabeling of the issue fundamentally helps with the problem of teaching the content of parallel programming. You still have to cover races. You still have to cover atomicity. You still have to cover the basics of performance (IMO, notions of work and span -- see http://www.cilk.com/multicore-e-book/). None of that happens without actually teaching parallel programming, whether it's the default or not. If you're going to cover these concepts, it seems to me that something in the curriculum must give, or the curriculum must be modified to recognize that everyone can't learn everything in an undergraduate degree program in computer science. |
| August 21, 2008 12:53 PM PDT
Kenneth Bodin |
I think the term "students" requires a pretty strict definition if the curriculum is to be discussed. By far the largest volume of students, as well as professional programmers, are users, modifiers and adaptors of code and software tools. A much smaller fraction are developers of original code and tools, including first design, programming implementation, validation etc. Parallelism plays totally different roles at these levels, as do programming languages, standard libraries etc. Developing a functional and parallel, easy to use, high performance scripting language makes a lot of sense and would be extremely useful. However, those who develop this tool must know their parallelism well! My guess is that when you at Intel speak to universities, you talk to professors and researchers who are involved in educating fairly advanced programmers that aim at becoming tool developers. This is particularly true if you speak to researchers that are close to HPC and research in parallelism. This is not bad, since the volume of students they educate is fairly small and they are extremely valuable to the industry. Without them, there wouldn't be any parallel software whatsoever! A fair number of parallelism gurus will be needed in the coming decade. For the vast volume of programmers, much better parallel tools are needed, and this will simply have to take some time - most likely 5-10 years from now until the parallel paradigms have penetrated into the large volume tools, compilers and languages. Personally I believe that efficient parallelism requires declarative models of the problem that the software is supposed to deal with. This data can then be interpreted by an engine that is totally parallel, and you can talk to this engine via a scripting language. This is also very close to functional programming.
Of course, if your "engine" doesn't understand what you want to declare, or the language doesn't allow it - then you'll have to extend the engine and yet again the parallelism gurus are needed. OpenCL (and CUDA) are great tools for the tool developers, but not for the users of the higher level tools. So, we "soon" have OpenCL, so in a few years we will have software that utilizes heterogeneous parallelism, and in another few years people will start using it. So, in 4-6 years from now, we might be able to teach high level parallelism more efficiently at universities. Cheers, Kenneth Bodin HPC2N/VRlab, Umeå University |
| August 22, 2008 3:27 PM PDT
Clay Breshears (Intel)
|
Prof. Leiserson - Yes, even if we turn the tables or reverse the polarity of programming languages, we still have to worry about things like protecting variables or dividing loop iterations or knowing how to identify where these things must be done. You've hit the (hidden) vein of my little ramble here. (If that wasn't as clear as I thought, I may have used too much tongue and not enough cheek.) I'm still of the opinion that we can teach the basics of parallel programming, along with a method for writing parallel code, to CS1 students. I wouldn't expect much more than being able to identify parallelism, something about protecting shared memory and a simple threading model (OpenMP or TBB). If you wanted to avoid shared-memory models, you can wander into message-passing arenas, but that quickly becomes very onerous when trying to also teach the basics of a programming language. Couldn't we weave these ideas (or just enough to do some basic things) into the appropriate places of CS1 without displacing too much of what's being taught there now? We've got a lot of books on parallel programming that are aimed at the junior, senior, and graduate level student. Maybe we need a CS1 textbook that integrates OpenMP into a "standard" C/C++ text? (Maybe we need to wait for CS2 to do TBB so that the students have mastered objects and classes.) Frankly, I'm surprised that someone hasn't already taken their own text and put in some parallelism, even if it might be to just test the waters. I would hazard the guess that MIT students will come in as freshmen already understanding parallel programming or be able to pick it up before midterm. Therefore, we need to focus efforts toward schools at the state and regional university level. Could a programming language that executes in parallel by default make it any easier to teach those "parallel" topics that are needed?
Before we start radical changes in current curriculums, maybe we need to wait for a language/system that doesn't suffer from the pitfalls of shared-memory programming and also doesn't require the concentration on detail that explicit message-passing requires, but can be taught as CS1 and is still a relevant language to learn for life after college. |
| August 22, 2008 3:45 PM PDT
Clay Breshears (Intel)
|
I was also going to suggest specialized degree programs/tracks. I had this when I was an undergraduate in Computer Science. With today's vast array of jobs within the CS-related spectrum, I could easily envision a track for software development or IT or theory or games/graphics. Pick 2-8 of your faculty specialty areas and set up specialized course tracks to achieve the degree. There will be a lot of overlap in the early years, but you can still have some specialization (e.g., IT probably doesn't need parallel programming experience, Theory should at least know about it before they get to a Parallel Algorithms course, and all the programming focused tracks will need it as early as possible). There are already technical universities that offer specialized graphics and gaming degrees. I should think that today's traditional colleges and universities may need to consider such specialization to attract students when competing with these technical colleges. By doing so, whole courses and topics of study can be removed from the standard CS curriculum and parallelism can be stuck in when and where it would best be received and used. |

Dan Ernst
I hope they do well - the language tools we have today are not sufficient for the long term if we want to keep the productivity curve moving in a good direction.