Parallel Programming Talk #61 - Parallel Java with Intel's Paul Guermonprez

Welcome to Show 61 of Parallel Programming Talk.
On this episode Clay and Aaron talk Parallel Java with Intel's Paul Guermonprez.


Download a video of show #61.

Download an MP3 of show #61. (24MB)

First The News:

The University of Illinois at Urbana-Champaign is holding a research seminar on January 28, 2010:
"Provable Annotations for Race-Freedom"

Online and at 2405 Siebel Center for Computer Science on UIUC campus
Rajesh Karmani (joint work with P. Madhusudan and Brandon Moore), Department of Computer Science, University of Illinois at Urbana-Champaign

Abstract: Data-races in high-level programming languages almost always indicate an error in the program. The goal of this work is to build a framework to prove race-freedom in parallel programs with the help of programmer-written high-level annotations.

The seminar will consider a large class of data-parallel programs that achieve race-freedom by dividing the data amongst threads in intricate ways, and show that natural and simple annotations suffice to prove race-freedom.

Workshop on Parallel Programming Patterns
ParaPLoP 2010: March 30 - April 1 2010 in Carefree, AZ
You are invited to submit a paper for the event.

Listener Question Show: February 2, 2010
Send email to parallelprogrammingtalk@intel.com

We have made some improvements to the Parallel Programming Forum.
Let us know what you think.

On Today's Show:

Talking Parallel Java with Intel's Paul Guermonprez.
Paul is a software developer at Intel in Paris. He spent 8 years in the biotechnology world, developing scientific information systems in Java. He now helps developers optimize their software on Intel platforms.

1. Threads

First, you can use Java like any other decent programming language to write parallel code using low-level threads. Threads are kernel objects, and Java provides a complete API to control them: thread creation, synchronization primitives, joins, and so on. All of this is part of the standard java.lang.Thread class and of the standard virtual machine.

The coding concept is quite easy: move the method you want to execute in parallel into a thread object, and instantiate multiple thread objects. The problem comes later, when you need to protect access to shared variables, but that's the idea.
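As a minimal sketch of that idea, assuming the work is summing an array: the summing loop moves into a thread object, and several such objects are instantiated, one per slice. The names SliceSummer, ThreadSumDemo and parallelSum are hypothetical and chosen only for illustration. Each thread writes only its own field, so no shared variable needs protecting here.

```java
// Hypothetical example: one java.lang.Thread object per slice of the array.
class SliceSummer extends Thread {
    private final int[] data;
    private final int from, to;
    private long partial; // written only by this thread, read after join()

    SliceSummer(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    public void run() {
        // The method we wanted to execute in parallel now lives here.
        long s = 0;
        for (int i = from; i < to; i++) s += data[i];
        partial = s;
    }

    long partial() { return partial; }
}

class ThreadSumDemo {
    static long parallelSum(int[] data, int nThreads) {
        SliceSummer[] workers = new SliceSummer[nThreads];
        int chunk = (data.length + nThreads - 1) / nThreads; // ceiling division
        for (int t = 0; t < nThreads; t++) {
            int from = Math.min(t * chunk, data.length);
            int to = Math.min(from + chunk, data.length);
            workers[t] = new SliceSummer(data, from, to);
            workers[t].start(); // runs run() on a new thread
        }
        long total = 0;
        try {
            for (SliceSummer w : workers) {
                w.join(); // wait for the worker; its writes are visible afterwards
                total += w.partial();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(parallelSum(data, 4)); // prints 500500
    }
}
```

If the workers added directly into one shared total instead, that variable would need a lock or an atomic type, which is exactly the "problem that comes later".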

If you want to go a bit further, Java provides the java.util.concurrent package to manage thread pools. It also offers highly optimized thread-safe containers such as BlockingQueue and ConcurrentHashMap.
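A small sketch of both pieces together, assuming Java 8 or later for the lambda syntax: a fixed-size thread pool counts words, and the counts go into a ConcurrentHashMap so the tasks need no explicit locking. The class name PooledWordCount and the one-minute timeout are hypothetical choices for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PooledWordCount {
    // Counts words across all lines, submitting one task per line to the pool.
    static Map<String, Integer> count(List<String> lines, int poolSize) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (String line : lines) {
            pool.execute(() -> {
                for (String word : line.trim().split("\\s+")) {
                    // merge() is atomic per key on ConcurrentHashMap
                    if (!word.isEmpty()) counts.merge(word, 1, Integer::sum);
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(Arrays.asList("a b a", "b c"), 2);
        System.out.println(counts.get("a") + " " + counts.get("b") + " " + counts.get("c")); // prints 2 2 1
    }
}
```

The pool reuses a fixed number of threads however many lines are submitted, which is the main gain over creating one thread per task by hand.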

2. JSR166y

With threads you can do everything, but it's a bit complex: you have to fully understand the concepts and technical problems caused by low-level parallel programming. That's why higher-level libraries like JSR-166y have been invented.

This library provides a fork-join model to guide parallel programming. You don't have to deal with the technical details: just define a task to be executed in parallel, an object to merge results from two tasks, and store your data in what we call a "parallel array". All you have to do next is associate the three together and ask the library to execute it on the hardware you have.
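The fork-join idea can be sketched as follows. Note the swap: this example does not use the "parallel array" API described above. It uses the fork/join classes that later shipped in java.util.concurrent (ForkJoinPool and RecursiveTask), and assumes Java 8 or later for ForkJoinPool.commonPool(). The task is "square each element", the merge step is addition, and the names SumSquaresTask, ForkJoinDemo and the THRESHOLD value are hypothetical.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical task: sum of squares over data[from, to).
class SumSquaresTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // assumed cutoff for sequential work
    private final int[] data;
    private final int from, to;

    SumSquaresTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: do the work directly.
            long s = 0;
            for (int i = from; i < to; i++) s += (long) data[i] * data[i];
            return s;
        }
        int mid = (from + to) >>> 1;
        SumSquaresTask left = new SumSquaresTask(data, from, mid);
        SumSquaresTask right = new SumSquaresTask(data, mid, to);
        left.fork();                        // schedule the left half asynchronously
        long rightResult = right.compute(); // compute the right half in this thread
        long leftResult = left.join();      // wait for the left half
        return leftResult + rightResult;    // merge results from the two tasks
    }
}

class ForkJoinDemo {
    static long sumSquares(int[] data) {
        // The pool sizes itself to the available processors and balances
        // the subtasks between its worker queues.
        return ForkJoinPool.commonPool().invoke(new SumSquaresTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        System.out.println(sumSquares(new int[] {1, 2, 3})); // prints 14
    }
}
```

Nothing in the task mentions a thread count; the pool decides how to spread the subtasks over the hardware it finds.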

The library is able to adapt dynamically to your resources, and move processing between processor queues with minimal overhead. Sounds simple, but it's not! Staying at a very high level of abstraction allows the library to perform very smart optimizations.

The style and behavior are those of functional programming. If you are not yet familiar with functional programming, this is a great way to discover it, as this type of programming is the basis for a lot of interesting parallel programming projects and languages. If you read Intel Software Network often, you may know Threading Building Blocks in C++; it's the same concept.

This Java library is not part of the default VM but is planned for release 7. The future will be functional programming or won't be at all.

3. HADOOP

All threads of a program share the same address space, which is why thread-based software can only take advantage of a single machine. If you want to go further and use a cluster, you have to use a mix of processes, threads and network messages. Writing threads by hand was complex, but that's nothing compared to building working, efficient cluster software. That's why frameworks like HADOOP exist.

HADOOP is a project from the Apache Foundation to develop distributed computing solutions. It's the same concept as JSR-166y, but on a larger scale. It's focused on data and data processing; it offers no solution for user interfaces or application logic.

Links for more information

JSR166-y by Doug Lea :
http://gee.cs.oswego.edu/dl/concurrency-interest/

Parallel programming in java :
http://java.sun.com/docs/books/tutorial/essential/concurrency/

Join us next week for another exciting episode of Parallel Programming Talk. On the show we'll be taking listener questions, so let us know what you're thinking by sending questions to parallelprogrammingtalk@intel.com

And remember, programmers: be thread safe.