# n-bodies: a parallel TBB solution: parallel code: balanced recursive parallelism with parallel_invoke

Last time, after struggling with different lock configurations to reduce synchronization overhead managing the interactions of n-squared bodies, I changed perspectives on the problem by spatially representing the interactions between all the bodies and (re-)discovering in that view a means to group the interactions so that independent threads could work together without having to worry about locking the data.

This time I’ll turn the concept into some sample code, but first a reminder of that partitioning scheme.

Because the ranges of i and j are disjoint between triangles A and B, I can set two threads loose, one on each, and let them handle the interactions between bodies represented by points in the region.  I can’t touch any of the pairs represented in rectangle C while A and B are being worked, but I can wait until A and B are done, then start C.

So it looks like the algorithm will have two steps, with two different threads of recursion.  Each triangle will be split into two, smaller triangles and an adjacent rectangle; each rectangle will be subdivided into four smaller rectangles.  I can keep splitting until there’s only one point left, but that might prove to be overkill.

```void

body_interact(int i, int j)

{

int d = j - i;

if (d > 1) {

int k = d/2 + i;

parallel_invoke([&]() {body_interact(i,k);}, [&](){body_interact(k,j);});

rect_interact(i, k, k, j);

}

}```

Here’s a function called body_interact(), which takes a pair of indices representing the range of the incoming triangle.  The function finds the midpoint of the defining interval and uses that point, if it is distinct from the endpoints, to define the next pair of triangles and rectangle.  It schedules calls to itself for each of the triangles, and then calls rect_interact() to handle the adjacent rectangle.  Inherent in the behavior of parallel_invoke() is to wait for both (or all) calls to complete before returning, thus guaranteeing that interactions in the rectangle will not start until work is done in the triangles.

This triangle code is but a subset of what is needed for the rectangle recursion.  For it I need two midpoints and two parallel recursion calls:

```void

rect_interact(int i0, int i1, int j0, int j1)

{

int di = i1 - i0; int dj = j1 - j0;

if (di > 1 && dj > 1) {

int im = i0 + di/2;

int jm = j0 + dj/2;

parallel_invoke([&]() {rect_interact(i0, im, j0, jm);},

[&]() {rect_interact(im, i1, jm, j1);});

parallel_invoke([&]() {rect_interact(i0, im, jm, j1);},

[&]() {rect_interact(im, i1, j0, jm);});

}

else {

for (int i = i0; i < i1; ++i)

for (int j = j0; j < j1; ++j)