Intel® Concurrent Collections for C++ 0.4 for Windows* and Linux*


Last Modified On :   November 16, 2009 12:06 PM PST

Product Overview

Intel® Concurrent Collections for C++ provides a mechanism for constructing a C++ program that will execute in parallel while allowing the application developer to ignore issues of parallelism such as low-level threading constructs or the scheduling and distribution of computations. The model allows the programmer to specify high-level computational steps, including their inputs and outputs, without imposing unnecessary ordering on their execution. Code within the computational steps is written using standard serial constructs of the C++ language. Data is either local to a computational step or explicitly produced and consumed by steps. An application in this programming model supports multiple styles of parallelism (e.g., data, task, pipeline parallel).

While the interface between the computational steps and the runtime system remains unchanged, a wide range of runtime systems may target different architectures (e.g., shared memory, distributed) or support different scheduling methodologies (e.g., static or dynamic). Here we provide a runtime system for shared memory systems that supports parallel execution, although it is not yet highly optimized. Our goal in supporting a strict separation of concerns between the specification of the application and the optimization of its execution on a specific architecture is to help ease the transition to parallel architectures for programmers who are not parallelism experts.

New with Release 0.4.0

  • Linux* support added
    • Support for Linux* OS has been added.
  • New runtime API
    • We have developed a new C++ API that provides a natural way to program in C++ using the Intel® Concurrent Collections design methodology.  The API also includes interfaces for tuning and debugging your Intel® Concurrent Collections program. Note that this new version is not compatible with the previous versions. See the Runtime API document for detailed information.
  • Textual notation updated
    • The specifications of tag type for item collections and tag collections are now required. The optional attributes for item/step/tag collections have been changed. See the Textual Notation document for detailed information.
  • Samples updated
    • All samples have been updated to conform to the new runtime API and textual notation.
  • Documents updated
    • All documents have been updated to conform to the new runtime API and textual notation.
For more information, see the What's New with Release 0.4.0 section in the release notes.


Features and Benefits

  • Simple, easy-to-learn C++ source language binding provides a unified model for writing multi-core enabled applications.

  • Programming model supports all styles of parallelism, so there is no need to re-write or re-compile the application in order to change the style of parallelism.

  • No knowledge of parallel technologies is required to write correct programs that execute in parallel. This means domain experts do not have to become parallelism experts or learn about threading.

  • Translator tool helps the application developer convert his/her program into C++ classes that define the program for the run-time. The translator also provides code skeletons for the code that the developer needs to write to interface to the run-time.

  • The programming model supports production of a single source that can be used with run-times targeted for different parallel architectures and produce the same results. Thus, there is no need to re-write or re-compile the application in order to target a new configuration.

  • Trace option in the debug release logs the execution sequence of computational steps so that the user can verify correct execution flow.

  • Visual Studio* integration allows the user to work within a familiar programming environment on Windows.



    Video: Overview and use with Microsoft* Visual Studio with Ganesh Rao



    Video: Overview of Intel® Concurrent Collections for C/C++ (Part 1, Part 2, Kath Knobe)

Technical Requirements

  1. Intel® Concurrent Collections for C++ is supported on the Microsoft Windows* OS and Linux* OS running on IA-32 or Intel® 64 architecture systems.
  2. For Microsoft Windows*, you must have Microsoft Visual Studio* 2005 SP1 or Microsoft Visual Studio* 2008 with the Visual C++* component installed on your system.
  3. For Linux*, you must have GNU g++ version 3.4.2 or greater installed on your system.
  4. You must have Intel® Threading Building Blocks 2.1 for Open Source installed on your system. See Getting Started for details on how to download and install this version.
Note: the Intel® Concurrent Collections for C++ product is designed to run with either the Microsoft* or Intel® C++ compilers on Microsoft Windows*, and either the GNU g++ or Intel® C++ compilers on Linux*.

Online Documentation
The following documents for Intel® Concurrent Collections for C++ are available online:
Intel® Concurrent Collections: Parallelization of C++ Programs Illustrated with Examples presented by Kath Knobe & Ganesh Rao at IDF 2009 (pdf 2.79 MB)

Frequently Asked Questions

1.     What is Intel® Concurrent Collections for C++?

Intel® Concurrent Collections for C++ is a programming model that allows application domain experts to write multi-core enabled applications without reasoning about threads, the parallel algorithm, or details of scheduling.

2.     What is the derivation of the name “Intel® Concurrent Collections for C++”?

We wanted a name that was both generic and descriptive  of what the product does and one that did not conflict with existing names in the industry.  “Concurrent” obviously alludes to the fact that this is a parallel programming technology.  Wikipedia defines “Collections” as a grouping of some variable number of data items.  It is also a term used in object oriented programming to refer to a set of objects.  Intel® Concurrent Collections for C++ speaks of three types of objects:  steps (computations), tags (controls), and items (data).  So the name seemed both generic and descriptive!

By the way, the original name for the technology, “TStreams”, referred to tagged streams.  The tags allow Intel® Concurrent Collections for C++ to be more powerful than strict streaming models.  You will see the name “TStreams” in several papers that are available on the web, co-authored by one of the developers of the technology who is now an Intel employee, Kath Knobe.

3.     How do I write parallel applications using Intel® Concurrent Collections for C++?

You are an application domain expert.  Imagine you are describing your application to a friend on a whiteboard.  You specify the computational steps and their inputs and outputs in graph form.  Intel® Concurrent Collections for C++ provides a simple textual notation for you to represent your graph.  The textual notation is processed by a Translator, which generates a header file that defines the interface to the runtime system.  A “coding hints” file, also generated by the Translator, helps the user correctly specify the interface of the computational steps to the runtime system.

In contrast to the way one would write a serial program, at no time does one specify the execution order of steps, only their inputs and outputs.  The runtime system is responsible for scheduling the execution of each step when its inputs are available and all the conditions for its execution are met.  At no time in this process does the domain expert reason about execution order or scheduling.
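The idea of declaring only inputs and outputs can be illustrated with a small toy scheduler written in standard C++. This is a sketch under stated assumptions, not the Intel® Concurrent Collections runtime API: the names `Step`, `ItemStore`, `run_when_ready`, and `example_total` are hypothetical and exist only for this illustration, and the scheduler is deliberately single-threaded to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy illustration only: these types are hypothetical and are NOT the
// Intel Concurrent Collections runtime API.
using ItemStore = std::map<std::string, int>;

struct Step {
    std::string name;
    std::vector<std::string> inputs;            // items the step consumes
    std::string output;                         // item the step produces
    std::function<int(const ItemStore&)> body;  // ordinary serial code
};

// Runs every step whose inputs are available until no more steps can run.
// Returns the step names in the order in which they actually executed.
std::vector<std::string> run_when_ready(std::vector<Step> pending, ItemStore& items) {
    std::vector<std::string> executed;
    bool progress = true;
    while (progress && !pending.empty()) {
        progress = false;
        for (std::size_t i = 0; i < pending.size(); ++i) {
            bool ready = true;
            for (const std::string& in : pending[i].inputs) {
                if (items.count(in) == 0) { ready = false; break; }
            }
            if (!ready) continue;
            items[pending[i].output] = pending[i].body(items);
            executed.push_back(pending[i].name);
            pending.erase(pending.begin() + static_cast<std::ptrdiff_t>(i));
            progress = true;
            break;  // rescan from the start now that a new item exists
        }
    }
    return executed;
}

// Example graph: total = (x + 1) + (x * 2). The steps are listed in a
// deliberately "wrong" order; only their inputs and outputs are declared.
int example_total(int x) {
    ItemStore items;
    items["x"] = x;  // input supplied by the environment
    std::vector<Step> steps = {
        {"sum", {"inc", "dbl"}, "total",
         [](const ItemStore& s) { return s.at("inc") + s.at("dbl"); }},
        {"double", {"x"}, "dbl",
         [](const ItemStore& s) { return s.at("x") * 2; }},
        {"incr", {"x"}, "inc",
         [](const ItemStore& s) { return s.at("x") + 1; }},
    };
    run_when_ready(steps, items);
    assert(items.count("total") == 1);  // the final item must have been produced
    return items.at("total");
}
```

With these definitions, `example_total(5)` yields 16. The `sum` step is listed first but runs last, because its inputs become available only after the other two steps have produced them.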

For a full discussion see the Concurrent Collections Tutorial (PDF 402KB).

4.     What is in the Intel® Concurrent Collections for C++ package?

Intel® Concurrent Collections for C++ includes the Translator/Hints file generator, user documentation (Getting Started, User’s Guide (PDF 197KB), Runtime API, Textual Notation (PDF 86KB),  Tutorial (PDF 402KB), and Release Notes), runtime library, samples, and Visual Studio* add-ins that support either Visual Studio* 2005 or Visual Studio* 2008.

5.     What benefits can I expect from using Intel® Concurrent Collections for C++?

The Intel® Concurrent Collections for C++ approach allows for a separation of concerns in parallel programming. You as a domain expert only need to focus on the semantics of computations (what the domain expert already knows) without reasoning about the underlying parallel model and the specific target architecture. Porting to a different target platform or interfacing to a different runtime is simplified because the description of the computation remains the same.

6.     What applications are suited for parallelization with Intel® Concurrent Collections for C++?

Every computation step specified in an Intel® Concurrent Collections for C++ graph is assumed to be executable in parallel, subject only to the constraints imposed by the relations in the graph. When all required inputs to a step are available, the step can start execution and may produce outputs that further drive the execution of other step instances.  Re-targeting the application for a different form of parallel execution requires only linking the compiled application with a runtime appropriate to the hardware configuration.
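The following sketch illustrates step instances that may run in parallel and a consuming step that waits for its inputs. It uses only the standard C++ facilities `std::async` and `std::shared_future` as a stand-in for a parallel runtime; it is not the Intel® Concurrent Collections API, and the names `square` and `sum_of_squares` are hypothetical.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Toy illustration only: std::async and std::shared_future stand in for a
// parallel runtime. This is NOT the Intel Concurrent Collections API.

// A "step" containing ordinary serial code.
int square(int v) { return v * v; }

// Computes the sum of squares of the inputs. The square step instances do
// not depend on each other, so all of them may run at the same time. The
// final step depends on every squared item and waits for each one.
int sum_of_squares(const std::vector<int>& inputs) {
    std::vector<std::shared_future<int>> squared;  // one item per step instance
    for (int v : inputs) {
        squared.push_back(std::async(std::launch::async, square, v).share());
    }
    // The consuming step proceeds as its inputs become available; get()
    // blocks until the producing step instance has written the item.
    std::future<int> total = std::async(std::launch::async, [squared]() {
        int sum = 0;
        for (const std::shared_future<int>& item : squared) {
            assert(item.valid());
            sum += item.get();
        }
        return sum;
    });
    return total.get();
}
```

For example, `sum_of_squares({1, 2, 3, 4})` returns 30 regardless of the order in which the individual `square` instances happen to finish.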

7.     How do I get started?

You must have the prerequisite software (see Technical Requirements) installed on your system before downloading.  After downloading Intel® Concurrent Collections for C++, you can follow the easy instructions in the Getting Started document in the package and begin building some of the samples to familiarize yourself with application building procedures.  The User’s Guide provides more detailed information as you begin migrating your own serial programs to Intel® Concurrent Collections for C++.

8.     How does Intel® Concurrent Collections for C++ differ from other parallel models such as OpenMP* and Intel® Threading Building Blocks?

OpenMP* is a shared memory parallel programming model consisting of a set of compiler directives and library routines that extend C/C++ (and Fortran). The user uses the language extensions to specify control structures, data environment, synchronization, and scheduling approach for parallel executions.
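As a point of comparison, a minimal OpenMP* sketch of a dot product is shown below. The function name `dot` is chosen only for this illustration. The directive states explicitly that the loop is parallel, how iterations are scheduled, and how the shared accumulator is combined. If the compiler is not invoked with OpenMP* support enabled, the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product with an OpenMP work-sharing loop. The programmer specifies
// the parallel loop, the scheduling approach (static), and the handling of
// the shared accumulator (a reduction over +).
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());  // precondition: vectors of equal length
    const long n = static_cast<long>(a.size());
    double sum = 0.0;
    #pragma omp parallel for schedule(static) reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        const std::size_t k = static_cast<std::size_t>(i);
        sum += a[k] * b[k];
    }
    return sum;
}
```

Calling `dot({1.0, 2.0, 3.0}, {4.0, 5.0, 6.0})` returns 32.0.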

Intel® TBB is a C++ template library that implements a set of common parallel programming patterns. The user chooses among parallel algorithm templates for parallelizing an application without the need to deal with threads and scheduling policies.

Like OpenMP* and Intel® TBB, Intel® Concurrent Collections for C++ provides a high-level programming abstraction for parallel programming.  Unlike these other parallel programming approaches, however, Intel® Concurrent Collections for C++ does not require the application domain expert to reason about the parallel algorithm or scheduling.

9.     What is the state of the product on the WhatIf site?

This is a prototype product. Performance and scalability are not yet fully addressed by this prototype release.

10. How do I report problems or send feedback?

Please report problems or send feedback through our Intel® Concurrent Collections for C++ Forum.
You can also post comments to our software engineering blogs.

11. What kind of feedback are you looking for?

We appreciate all feedback, in particular feedback on the following:

  1. Your thoughts on this parallel programming model. How does it compare with other techniques you have used?
  2. Features of Intel® Concurrent Collections for C++. What missing features are important to you?
  3. Syntax of the Intel® Concurrent Collections for C++ textual notation.
  4. Ease of use of the current implementation.  What did you find difficult to use or to understand?
  5. Are there restrictions that present problems for your application domain?
  6. What did you like?

Please visit the Intel® Concurrent Collections for C++ Forum and share your thoughts in an existing discussion thread, or create a new one.

Primary Technical Contacts


Kath Knobe
In the 1980s, Kath Knobe worked for a company that designed compilers for a number of the super (and not quite super) computers of the day. She learned a lot about parallel systems. One of the things she learned was that the state of the art was in serious need of repair. Distinct communities and approaches developed around distinct target architectures: shared or distributed memory; vector, SIMD, or VLIW; etc. With a commitment to cleaning up our understanding of parallelism, she went back to school and got her PhD at MIT. Projects on the way toward that goal include the following. “Data Optimization” addresses locality. “Array SSA Form” is an analyzable static single assignment form. “The Subspace Model” and “Weak Dynamic Single Assignment form” are dynamic single assignment forms. “Stampede” is support for streaming media. Concurrent Collections (formerly TStreams) is a parallel programming model developed from this experience.

Geoff Lowney
Geoff Lowney is an Intel Fellow, Software and Solutions Group, Pathfinding and Innovation Division, and Director of Compiler and Architecture Advanced Development. He is responsible for using advanced compiler technology to improve the performance and usability of Intel Architecture processor family products.

Shin Lee
Shin Lee is a staff software engineer at Intel and the project leader of the Intel® Concurrent Collections for C/C++ Team. She has a long-standing interest in the areas of optimization and parallelization. Prior to joining Intel, she worked at Wang Labs, Encore Computers, Digital Equipment, and Compaq. She was one of the developers of HPF while at Digital.

Bob Monteleone
Bob Monteleone is the manager of the Intel® Concurrent Collections for C++ project. He has 23 years of software development experience, working as a software engineer and a software engineering manager. Throughout his career, he has been focused on compilers and software developer tools. He is interested in empowering developers to exploit the benefits of software parallelism as its importance in the software industry continues to increase.

Stephen Rose
Stephen Rose is a senior software engineer at Intel working on the Intel® Concurrent Collections for C/C++ product, as well as being the project leader for the Fortran and C/C++ libraries. He has been developing runtime libraries and APIs for over 29 years at Wang Labs, Digital Equipment, and Intel. His interest is in providing layered software to enable engineers to concentrate on their area of expertise. In his spare time, Steve coaches a Senior Major Little League baseball team.

Leo Treggiari
Leo Treggiari is a staff software engineer at Intel working on the Intel® Concurrent Collections for C/C++ product. He has developed a wide range of programming tools over his 33-year career at Wang Labs, Digital Equipment, Compaq, and Intel. His interest is in providing tools that make software developers more productive.

Mark Hampton
Mark Hampton is a software engineer at Intel working on the Intel® Concurrent Collections for C++ project, as well as other research projects designed to improve the effectiveness of parallel programs.  Prior to joining Intel, he earned his PhD at MIT, where his research focused on exception handling mechanisms for explicitly parallel architectures, compilation techniques for the Scale vector-thread architecture, and instruction sets designed to improve energy-efficiency.

Frank Schlimbach
Frank Schlimbach is a Senior Software Engineer at Intel Germany working on Intel® Concurrent Collections for C++ and other software tools for parallel programming. He has worked on parallelism for more than 15 years, with a primary focus on distributed computing. Coming from Pallas, he started at Intel as the project lead for Intel® Trace Analyzer and Collector. His interest in parallelism and Concurrent Collections is driven by the search for an intuitive and manageable programming paradigm resulting in scalable parallel execution.

Nikolay Kurtov
Nikolay Kurtov is a software engineering intern at Intel in Novosibirsk, Russia. He is a student at Novosibirsk State University and is interested in high-level library optimization research. He believes that Intel® Concurrent Collections for C++ is an effective parallelization tool that can be used by mainstream developers who are seeking a clean and simple way to realize blazing performance on multi-core systems.