Parallel Pattern implementation study: Structured Grids in Lattice-Boltzmann simulations

Below is the "instructional design" description of the first in a series of implementation studies, looking for instances of Parallel Design Patterns in actual code. Feedback welcome!

1. Module Name: Parallel Pattern Implementation Studies: structured grids as implemented in the Lattice-Boltzmann project OpenLB

2. Writers: Michael Wrinn

3. Targeted availability: end of Q4 2008

4. Brief Module Description

Proposed duration: ~ 1 hour, self-guided

The ISC module Architecting Parallel Software described in detail how software patterns are organized into Structural and Computational categories and linked to a set of Concurrency Patterns. The result is the Parallel Pattern Language (PPL) being explored and refined by researchers at UC Berkeley, UIUC, and Intel. The module described here is one of a number of case studies intended to demonstrate, with working parallel programs, how the PPL maps to actual solutions.

This particular case study in pattern mining examines one category of algorithms: Lattice-Boltzmann solutions for fluid flow simulation. Compared to the fluid simulation methods currently used in game physics, Lattice-Boltzmann methods show promise as both more general and more sophisticated approaches to physics solutions for fluids and smoke. For real-time use, they require computing capability just beyond typical current desktop systems, but within reach of the multicore and manycore systems anticipated in the near future.

The specific patterns involved are the Computational Pattern Structured Grids, the Concurrency Pattern Data Parallelism, and the supporting structures Loop Parallelism and SPMD (single program, multiple data). Please see the Architecting Parallel Software module for definitions and further explanation.
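As a rough illustration of how Structured Grids and Loop Parallelism fit together, the sketch below updates every interior cell of a small 2D grid from its four neighbours. It is not OpenLB code and not a Lattice-Boltzmann kernel: the `Grid` type and `sweep` function are hypothetical names chosen for this example, and the averaging rule simply stands in for a real collide-and-stream step. The OpenMP pragma is guarded, so the code also builds as plain sequential C++.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D structured grid, stored row-major in one contiguous array.
struct Grid {
    std::size_t nx, ny;
    std::vector<double> cells;

    Grid(std::size_t nx_, std::size_t ny_, double init = 0.0)
        : nx(nx_), ny(ny_), cells(nx_ * ny_, init) {}

    double& at(std::size_t x, std::size_t y) { return cells[y * nx + x]; }
    double at(std::size_t x, std::size_t y) const { return cells[y * nx + x]; }
};

// One sweep: each interior cell of `out` becomes the average of the four
// neighbours of the same cell in `in`. Reads and writes use separate grids,
// so every loop iteration is independent of the others. Boundary cells of
// `out` are left untouched.
void sweep(const Grid& in, Grid& out) {
    assert(in.nx == out.nx && in.ny == out.ny);
    const long rows = static_cast<long>(in.ny);

    // Loop Parallelism: with OpenMP enabled, rows are shared among threads.
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long row = 1; row < rows - 1; ++row) {
        const std::size_t y = static_cast<std::size_t>(row);
        for (std::size_t x = 1; x + 1 < in.nx; ++x) {
            out.at(x, y) = 0.25 * (in.at(x - 1, y) + in.at(x + 1, y) +
                                   in.at(x, y - 1) + in.at(x, y + 1));
        }
    }
}
```

The regular neighbour access is what makes the grid "structured", and the independence of the iterations is what makes the loop safe to parallelize with a single directive.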

The coding examples to be used were developed in the OpenLB project under Dr Jonas Latt in Lausanne; Dr Latt has enthusiastically agreed to work with us on this project, and considers the OpenLB implementation to be an excellent example. The concurrency was implemented in two different ways, so the merits of those choices will be compared. In the future, the project could also be tested in a third way, using Intel's Ct.

The module will consist, explicitly, of descriptions of the Lattice-Boltzmann algorithms, their mapping to the proposed Berkeley Pattern Language for Parallel Programming, code fragments illustrating those mappings, and comparative performance discussion. Users will be able to download the OpenLB code, and run the illustrations themselves, at their own pace.
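To give a flavour of the kind of fragment involved, here is a minimal sketch of the SPMD pattern applied to a row-wise decomposition. It is an assumption-laden illustration, not taken from OpenLB: `ownedRows` and `rankProgram` are hypothetical names, and the per-rank work is a simple sum rather than a lattice update. Every rank would run the same function and touch only the rows it owns.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Rows [begin, end) owned by `rank` when n rows are split over nRanks ranks.
// The first (n % nRanks) ranks each receive one extra row.
std::pair<std::size_t, std::size_t> ownedRows(std::size_t rank,
                                              std::size_t nRanks,
                                              std::size_t n) {
    assert(nRanks > 0 && rank < nRanks);
    const std::size_t base = n / nRanks;
    const std::size_t extra = n % nRanks;
    const std::size_t begin = rank * base + (rank < extra ? rank : extra);
    const std::size_t end = begin + base + (rank < extra ? 1 : 0);
    return {begin, end};
}

// The "single program": identical code for every rank; only the rank id
// decides which part of the data it works on.
double rankProgram(std::size_t rank, std::size_t nRanks,
                   const std::vector<double>& rowValues) {
    const std::pair<std::size_t, std::size_t> rows =
        ownedRows(rank, nRanks, rowValues.size());
    double local = 0.0;
    for (std::size_t i = rows.first; i < rows.second; ++i) {
        local += rowValues[i];
    }
    return local;
}
```

In a real SPMD program each rank would be a separate process or thread, and the partial results would be combined with a reduction; calling `rankProgram` once per rank in an ordinary loop only simulates that sequentially.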

5. Needs Analysis

The current shift from sequential to multicore and manycore processors presents serious challenges to software developers. A significant part of the industrial and research communities continues to believe that either a) they can make do with incremental approaches, or b) the right compiler, parallel language and so forth will save them. Such ad hoc responses are likely to prove neither correct nor sustainable. To systematically find and exploit parallelism, and to achieve forward scalability (that is, designs which efficiently scale to much larger numbers of cores), will require re-architecting software applications.

The key to re-architecting software is the use of design patterns and a pattern language. We divide these into Structural (aka architectural styles), Computational (originally presented in the View from Berkeley as thirteen dwarfs), and Concurrent (originally presented in the book Patterns for Parallel Programming). More information, with links to updates, is available here.

This implementation study looks at a specific use of the proposed pattern language to illustrate concurrency design choices made in the OpenLB project.

6. Subject Matter Experts (SMEs):

Prof Jonas Latt (Ecole Polytechnique, Lausanne), Michael Wrinn

7. Learner Analysis

The ideal student for this module is an adult learner at a university who exhibits the learning characteristics of adult learners as described in ACM documentation. In particular, the learner:

  • Has some software design experience, using any standard programming language (e.g. Java, .NET, C, C++).
  • Has the ability to learn from a lecture/discussion environment only.
  • Has an ability to generalize from examples.
  • Demonstrates a willingness to tackle a difficult concept and deal with complexity.
  • May or may not have an understanding of the issues of parallel programming, but is at least familiar with one concurrent programming method; this implies an advanced student, typically in the 3rd or 4th year of an undergraduate program.
  • Currently instructs or plans to instruct adult students who fit the learner description earlier in this section.
  • Currently uses a successful programming curriculum, or intends to create or teach one soon.

8. Context Analysis

The purpose of a Context Analysis is to identify and describe the environmental factors that inform the design of this module. Environmental factors include:

  1. Media Selection: the lecture presentation will be in Microsoft* PowerPoint* format, including speaker notes and references to more detailed content. The lab itself is self-directed; the source code and basic documentation will be downloaded from the OpenLB project site at www.openlb.org. We will provide supplemental documentation explaining the steps to build and run the code successfully with the Intel compiler, and to control the parallel execution levels.
  2. Learning Activities: Lecture-only presentation; discussion of similarities and differences between the models presented is encouraged among students and between students and the instructor.
  3. Participant Materials and Instructor/Leader Guides: Instructor notes are included in the PowerPoint Notes sections. A recorded presentation and lecture notes for the slides, narrated by the course author, will be made available to external academics through the Intel Academic Community website.
  4. Transcript of expert delivery (audio recording)
  5. Packaging and production of training materials: Materials are posted to the Intel Academic Community website, for worldwide use and alteration.
  6. Training Schedule: The module is 2.5 to 3 hours of lecture.
  7. ACM curriculum positioning: anticipated to be senior undergraduate level (analysis ongoing, as this is new research).

9. Task Analysis

The primary Bodies of Knowledge (BKs) include, but are not limited to:

10. Concept Analysis

Software Design Patterns as a means to approach concurrency.

11. Learning Objectives

Given concepts and the examples (currently 6) from the module, students will be able to describe and recognize instances of:

The structured grid Computational Patterns

The decomposition Concurrency patterns (3 levels of detail)

as defined by the UC Berkeley / Intel research effort.

12. Criterion Items

Q: What are the key aspects of the Structured Grids pattern, and how are they used in the OpenLB programs?

Q: What are the criteria for selecting one or another Concurrency pattern?

Q: Are loop-parallel and SPMD implementation choices interchangeable? Does either have a strong dependency on the underlying platform architecture?

13. Expert Appraisal

This Content Design Document will be posted to the Intel Academic Community forum with an invitation for readers of the forum to comment. Additionally, a new, focused SME review from outside the company will be solicited and used. The planned Academic Community Advisory Council will review it in December.

14. Developmental Testing

Planned beta material will be posted to the ISC wiki in December 2008.

15. Production

Upon completion and successful passing of the Product Readiness Approval in the PDT, the materials produced for this module will be posted to the Intel Academic Community (IAC) website. There they will be available for download by IAC registered participants. This short, introductory module will be taught as appropriate (by SMEs as well as CAT).