On September 13th I will be participating in a panel at IDF on how the shift to parallelism will, or should, affect computer science education. My opinion is that this is a huge challenge, but one that can be met. However, it will require rethinking certain aspects of the CS curriculum, from how (and what) algorithms and data structures are introduced, to what languages are used. It is also worthwhile contemplating how and when the physical design of computers is covered, and what the role of abstraction should be---and which abstractions are appropriate.
Parallelism makes it even more difficult to write correct programs, let alone programs that perform well. However, it is vital that parallel programming be included as part of the CS curriculum broadly. This has to be done in a way that does not neglect core skills, and that ultimately enables students to create reliable, maintainable, and efficient applications.
Ultimately, the goal of CS education should be to instill in students accurate conceptual models of how programs will behave on physical computing machines, along with strong abstraction and design skills that will allow them to productively develop usable and efficient software for these machines. The real challenge is going to be one of balance: in a pragmatic sense, how can parallelism be covered in the limited time available for an undergraduate education? How can the necessary new concepts be covered, without sacrificing other crucial aspects of that education? What should the balance be between practical skills useful immediately and underlying concepts useful in the long run? What emphasis should be placed on performance and efficiency?
I personally feel that efficiency is important: if you don’t care about efficiency or performance, then you don’t have to care about parallelism. In practice, most applications of parallelism, from virtualization to high performance computing, are about achieving as much as possible with a given amount of hardware and power. Of course, this desire for execution efficiency needs to be balanced with productivity: the need to minimize development time. However, software developers (and their managers…) need to realize that unnecessarily inefficient programs are, in fact, environmental hazards, and result in extra costs, and possibly missed business opportunities. Is there a way to achieve both performance and productivity? I believe so, through the intelligent and informed use of appropriate abstractions.
To achieve performance, students need a clear understanding of the underlying hardware mechanisms in the computer architecture, and in particular of what assumptions the hardware makes about programs in order to deliver that performance. To construct reliable parallel programs efficiently, software developers also need knowledge of good practices for constructing efficient parallel programs, as well as practical knowledge of the tools.
These goals are complementary, not contradictory, if it is understood that the properties that the hardware requires from programs to achieve performance are relatively simple: data locality and latent parallelism, at multiple levels. Best practices, in turn, can be encapsulated in appropriate design patterns. Design patterns can be taught that have the properties of parallelism and locality, and case studies can show how they work in practice. Software developers with a good conceptual model of the hardware (and the compilers and other systems that map programs onto that hardware) can then use these abstractions intelligently to architect efficient programs.
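As a rough illustration of how a single pattern can carry both properties, here is a minimal sketch of a chunked map-and-reduce in Python. The function names (`chunked`, `sum_of_squares`, `parallel_sum_of_squares`) and the choice of workload are hypothetical, picked only for illustration. Each worker processes one contiguous block of the input (locality), and the blocks are independent of one another (latent parallelism). Note that in standard CPython, threads will not speed up pure-Python arithmetic; the sketch is meant to show the structure of the pattern, not to serve as a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, n_chunks):
    """Split data into contiguous blocks so each worker touches one local region."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """Per-block work: depends only on the block's own elements."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4):
    """Map the per-block work over independent blocks, then reduce the partial results."""
    chunks = chunked(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)  # the reduction step combines independent partial sums
```

The same decomposition (split into local blocks, process independently, combine) carries over to other programming models; only the mechanism that runs the blocks changes.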
To be an efficient software developer, one needs to use abstractions. It is not possible to code everything at the lowest level, all the time. Programming languages were invented for a reason. However, abstractions are often taught as a way to “hide information”. This is not quite right. My opinion is that abstractions should be taught as a way to automate and delegate the management of details. However, a good software engineer should, in theory, be able to understand the details, and use that knowledge to guide the selection of appropriate abstractions. “Information hiding” should not extend to a professional’s education.
As for tools and languages: these are still evolving, but what strikes me most is not the variety of parallel programming models available, but how often they are based on a common set of core design principles.
So, in my opinion, a curriculum based around appropriate and well-structured design patterns at multiple levels of abstraction, motivated by real applications and case studies of real systems, can achieve the desired goal: software developers who can efficiently create sophisticated, scalable, and reliable parallel applications, and whose skills can evolve along with the technology.