On the way to composing a first thread-parallel version of the n-body code, Robert points out that parallelization has already been occurring through the Intel compiler's vectorization of simple loops.
Forced to revisit the question of accumulating forces one more time, Robert tests addForce(i,j) and discovers that while accumulating accelerations is a little faster, the gain is small and the story is much more complicated than he realized.
With the most time-consuming function identified, this episode shows the process of drilling down into the hot source and optimizing it BEFORE going parallel.
Robert finds the hot function in the serial n-bodies code, but only after discovering what a good job of function inlining the Intel C++ Compiler does.
Wherein Robert attempts to compile his program and eventually remembers to switch to the Intel C++ compiler to accommodate the C++0x features the program uses.
Putting together the function to apply accelerations between a pair of gravitational bodies.
Robert finally deals with the eternal question, forces or accelerations? Which is it more efficient to accumulate?
Fleshing out how to interact between pairs of bodies.
Simulation technology has become quite complex in order to mimic the real world but it starts with some basic principles. Dynamic simulation using a fixed time step is a simple place to start. Last time I took a first stab at a data structure; this time I'll make use of it. for [...]
In considering a parallel solution to the n-body gravitational problem, it's important to carefully design your data.
I could dwell on the best laid plans but I’m starting to sound like a broken record, so rather than wasting any more time, let’s get on with it. We have n bodies we want to manage. Gravitation and the laws of motion give us some basic tools for approaching the solution but there are [...]
My vacation weeks have come and gone, plus a few intense weeks of playing catch-up on the work that accumulated while I was away. Finally though, I have a chance to deliver on a promise made a month ago to share some details on parallelizing the n-body gravitational problem. I've produced a series of refined versions of [...]
With the growing interest in parallel code, a quest has arisen for parallel algorithms and parallelizable algorithms to feed the beast of multi-core. This post marks the start of a series exploring at least the rudiments of the n-body gravitational problem and a recursive parallel algorithm from Matteo Frigo that orders the chaos.
Last time I laid out my plan for studying the partitioners. All that was left was to run the tests, collect and organize the data. Must have been a busy several months. It all went by in a blur. So here I sit, snowed in on my Christmas break and I finally have a chance [...]
I thought my tools were in order to dig inside the TBB task scheduler, but I rethought the approach and came up with a new and improved way to look at my nested parallel_for concurrency: seeing how task stealing partitions the work.
Wow, time flies around here. I was thinking it’s been a while since I last looked at my task scheduler experiments: a little vacation here, teaching at the O’Reilly Open Source convention there, and a little customer work stuffed around the edges can put a project completely out of mind. So let me recall and [...]
On a quest to understand the TBB scheduler and how it might be used to schedule tasks with order dependencies (i.e., a place where you’d like to block access to an object until it can get built), I’ve been building up tools to take a peek. Last time I showed a technique to use thread [...]
Little did I suspect as I was introducing the topic of blocking in parallel computation in my last post that it would generate such interest, even though it seemed a common problem I’d been working on privately with several Intel customers. Charles Leiserson amplified the pitfalls of employing blocking in a multi-threaded architecture and offered [...]
I’m back with another challenge, encountered during my support work for Intel® Threading Building Blocks. I’ve been working with several TBB users who appreciate the general philosophy of Cilk task scheduling embodied in TBB but have run into some practical challenges applying it to their applications. Often the issue revolves around the need to block [...]
It’s been a while since I’ve had a chance to play, but I’ve been curious what I might discover by applying Intel® Thread Profiler with user events to other TBB structures besides pipelines. I tried checking out Conway’s Game of Life but the source download page wouldn’t reveal its sourcey goodness, so I moved on to [...]
Concluding the series on I/O pipeline flow, we show an additional parameter change and the analysis that led to it, resulting in a near linear scaling of this test code out to 8 processors.
Last time I showed what happened when I tried to run a parallel I/O prototype using the TBB pipeline construct: Since the green patches indicate doing work and the green stripy ones are just spinning, this represents very poor parallel performance. Using the regular facilities of Intel® Thread Profiler, I can zero in on one of [...]
I've been experimenting with TBB pipelines as a means to overlap I/O and processing on a multi-core system, trying to understand how they work. Having a copy of Intel® Thread Profiler has been a wonderful aid in understanding what is going on, but there are some tricks I learned to use that add even more power [...]
Whew! I just finished teaching a two-day class in parallel programming on Linux which was also a first attempt to use dual-core laptops with simultaneous Linux and Windows partitions operating under a virtual monitor. Lots of challenges, things that almost worked or just didn't. But the students were great and we had a good time. And I managed [...]
Thanks, Kevin, for leading the way in this budding Threading Building Blocks community with your prolific posts. Now that the responsibilities of OSCON have passed, I hope to follow his example and explore and share my discoveries about Threading Building Blocks. I work as a technical consulting engineer at Intel, one of the people [...]