Open Parallel: Optimizing Web Performance with TBB

Open Parallel is a research and development company that focuses on parallel programming and multicore development. We are a bunch of highly skilled geeks from various backgrounds who work together on problems in parallel programming and software development for multicore and manycore platforms.

At LinuxConf (LCA2010), James Reinders gave a talk about the Threading Building Blocks (TBB) library, a C++ threading library that sets out to make multicore programming more accessible to the average programmer. We took this idea on board and explored how to open up this approach to an even wider audience: web application developers working in scripting languages.

Many websites require a non-trivial amount of per-request processing in the application layer, perhaps to retrieve, consolidate or otherwise manipulate data. Achieving better performance at this level improves response times and the overall user experience. Even when processing time at the application level is not critical, parallelizing access to database and web service back-end layers can yield substantial improvements in perceived performance.
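The back-end case can be illustrated with a minimal sketch. It is written in Python purely for brevity and is not the PHP or Perl API from this project. The back-end names and the `fetch_backend` and `fetch_all` helpers are hypothetical, and `time.sleep` stands in for real query or network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_backend(name, delay):
    """Hypothetical stand-in for a database query or web service call."""
    time.sleep(delay)  # simulates waiting on the back end
    return f"{name}: ok"


def fetch_all(backends):
    """Issue every back-end call concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        futures = [pool.submit(fetch_backend, name, delay) for name, delay in backends]
        return [future.result() for future in futures]


# Assumed example back ends, each taking about 0.2 s to answer.
BACKENDS = [("users_db", 0.2), ("posts_db", 0.2), ("geo_service", 0.2)]

if __name__ == "__main__":
    start = time.perf_counter()
    results = fetch_all(BACKENDS)
    elapsed = time.perf_counter() - start
    print(results, f"{elapsed:.2f}s")
```

Because the calls overlap, the request waits roughly as long as the slowest back end rather than the sum of all of them, which is where the gain in perceived performance comes from.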

This drove our goal of adding TBB support to PHP and Perl. We started with HipHop as the PHP implementation of choice and added Perl support later.

HipHop is a PHP-to-C++ compiler developed by Facebook to reduce resource needs and speed up execution across its gigantic web infrastructure, which started on a classic PHP/MySQL stack and now has to scale to hundreds of millions of users. HipHop is a thread-safe PHP implementation that already uses TBB for some memory management. We started by extending the existing support, at first adding only the new *parallel_for* function. Later, we added concurrent data structures and re-implemented our first approach.

What we have now is a robust implementation of *parallel_for* and *parallel_reduce* with the data structures needed to support them. What we learned on the way was both very enlightening and, at times, quite frustrating. We reached our aim of making TBB more widely accessible by getting the language extension into HipHop, but we also tried to get it into Zend PHP. That turned out to work only through a language compatibility module, which does not provide everything we can offer on the HipHop platform. The reason is the architecture of the PHP interpreter.
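For readers who have not met *parallel_reduce* before, here is a rough sketch of the pattern in Python. It is not the HipHop extension's interface, which is not shown here. The parameter names (`identity`, `body`, `join`, `grain`) are assumptions that loosely mirror the shape of TBB's functional form: split the range into chunks, reduce each chunk on its own, then join the partial results.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_range(n, grain):
    """Split [0, n) into half-open chunks of at most `grain` items."""
    return [(lo, min(lo + grain, n)) for lo in range(0, n, grain)]


def parallel_reduce(data, identity, body, join, grain=1000, workers=4):
    """Reduce each chunk independently, then join the partial results."""

    def reduce_chunk(bounds):
        lo, hi = bounds
        acc = identity
        for i in range(lo, hi):
            acc = body(acc, data[i])
        return acc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(reduce_chunk, split_range(len(data), grain)))
    # The join step must be associative for the result to be well defined.
    return reduce(join, partials, identity)


if __name__ == "__main__":
    total = parallel_reduce(list(range(10001)), 0,
                            lambda acc, x: acc + x,
                            lambda a, b: a + b)
    print(total)
```

The sketch shows the structure only. CPython threads do not run pure-Python arithmetic on several cores at once, so the speedup a compiled TBB implementation can deliver is not reproduced here.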

Implementing threading in language interpreters turns out to be very hard. There are two dormant or failed approaches in Perl, and every attempt in PHP has failed so far. The core developers on both sides very much doubt whether it is a path worth going down at all. The problem is global locking and the copying or sharing of data structures that are thread-local. Our Perl implementation is a starting point that could influence not only the Perl community but also other interpreter designers and developers.

In the Perl community we are lobbying for a const keyword that would lock a data structure and remove the need to copy it into every thread. The ability to make something immutable is missing in Perl and PHP, and this makes the startup cost of any worker thread very high. For the Perl library we wrote a lazy clone module that clones a data structure only if the worker thread actually accesses it. That way we only penalize the worker thread for accessing data - we may avoid cloning a structure altogether if it is not accessed within the task.
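The lazy clone idea can be sketched in a few lines. The following Python illustration is not the Perl module itself. The `LazyClone` class, its `get` method and the `worker` function are hypothetical names, and the sketch assumes each worker holds its own wrapper around the shared original.

```python
import copy


class LazyClone:
    """Per-worker proxy that deep-copies the wrapped object on first access."""

    def __init__(self, original):
        self._original = original
        self._clone = None
        self.cloned = False  # lets callers see whether the copy was ever paid for

    def get(self):
        if not self.cloned:
            self._clone = copy.deepcopy(self._original)
            self.cloned = True
        return self._clone


def worker(task, shared):
    """Hypothetical task body: only tasks that need the data trigger a clone."""
    if task["needs_config"]:
        config = shared.get()
        config["seen"] = True  # the mutation stays private to this worker
        return config["name"]
    return None


if __name__ == "__main__":
    original = {"name": "site", "plugins": ["a", "b"]}
    idle, busy = LazyClone(original), LazyClone(original)
    worker({"needs_config": False}, idle)
    worker({"needs_config": True}, busy)
    print(idle.cloned, busy.cloned, "seen" in original)
```

A task that never calls `get` never pays for the copy, and a task that does call it works on a private clone, so the original stays untouched.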

In our work with the HipHop PHP compiler we also wrote a patch set for WordPress that uses our new *parallel_for* language extension. This trial immediately reduced page load times. The patch set only replaced some key *foreach* loops with *parallel_for*, and it was our first real success with the TBB library in PHP. Based on that success we set out to re-implement our initial approach and tidy up our patch set for HipHop to make it more accessible to others.
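To show what replacing a *foreach* loop with *parallel_for* amounts to, here is a sketch in Python rather than PHP. It does not reproduce the actual WordPress patches or the extension's signature. `render_post` is a hypothetical loop body, and the rewrite is only safe under the assumption that iterations do not depend on each other or on shared mutable state.

```python
from concurrent.futures import ThreadPoolExecutor


def render_post(post):
    """Hypothetical per-item work that touches no shared mutable state."""
    return f"<li>{post['title'].strip().title()}</li>"


def render_sequential(posts):
    """The original shape: a plain loop over the items."""
    out = []
    for post in posts:
        out.append(render_post(post))
    return out


def parallel_for(items, body, workers=4):
    """Apply `body` to every item on a thread pool; output order matches input."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, items))


if __name__ == "__main__":
    posts = [{"title": " hello world "}, {"title": "tbb in php"}]
    print(parallel_for(posts, render_post) == render_sequential(posts))
```

Both versions produce the same list, which is the property that makes a loop a candidate for this kind of drop-in replacement.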

The Perl project worked towards a Perl module that gives direct access to TBB functions. We first implemented the core memory structures and then built the *parallel_for* functionality on top of them. The module we have now is stable enough to demonstrate the gains we can get by using TBB in Perl.

To round the project off we implemented two small tools as real-world demos and as working code to look at. The demos are based around the HTML5 geolocation support that is present in modern browsers and can be read with a JavaScript API. In the HipHop version we use it to read the current latitude and longitude from the accessing browser and then parse the Twitter firehose to find tweets with embedded image URLs.

In the Perl demo we query Flickr and fetch a 4x4 grid of images, cache them locally and then render one big image out of scaled versions of the individual images. The demos are running on geopic.me.

To sum up our experience with TBB and scripting languages: we now know that threading interpreters brings its very own set of challenges, but by using TBB we were able to get further than others on the same mission. The libraries we have produced so far - which are open source and can be found on our GitHub account - will be further developed and maintained.

We will continue working on both platforms to expose the power of multicore CPUs to developers in an approachable way. Along the way we also produced a number of more detailed white papers covering various aspects of the project:

* threads::tbb
* TBB in WordPress
* WordPress on HipHop

Get in touch if you are interested in these projects or have questions about the work we did. There is further information on our website, OpenParallel.com.

Contact: Nicolas Erdody

