Threading Building Blocks Convex Hull Code Walk, Part 1

In my last post I talked about the convex hull problem and some of the algorithms researchers have invented to solve it. In this post, I begin looking at the solution that comes with the Threading Building Blocks example problems.

The TBB convex_hull_sample.cpp program (found in the examples/parallel_reduce subdirectory) applies the QuickHull algorithm to solve convex hull problems. But my focus in this post is more on how various TBB components were applied in the convex_hull_sample.cpp program than on the relationship between the code and the QuickHull algorithm.

TBB components applied in convex_hull_sample.cpp

The convex_hull_sample.cpp program applies the following TBB components:

    • parallel_for

    • parallel_reduce

    • concurrent_vector

    • blocked_range

    • task_scheduler_init



It was in fact a grep for "concurrent_vector" that first led me to the convex hull example. What's nice about this particular example is that it can be used as a tutorial on how several different TBB components can be applied together to solve a problem. Most of the other TBB code examples focus on a single TBB component.

I've put the code online if you'd like to have it open in a window for viewing as you continue reading this post: convex_hull_sample.cpp and convex_hull.h

convex_hull_sample.cpp code structure

The convex_hull_sample.cpp program performs its analysis in a loop where the number of threads is incremented. Here's main():

int main(int argc, char* argv[]) {
    util::ParseInputArgs(argc, argv);

    pointVec_t points;
    pointVec_t hull;
    size_t nthreads;
    util::my_time_t tm_init, tm_start, tm_end;

    std::cout << " Starting TBB-bufferred version of QUICK HULL algorithm" << std::endl;

    for(nthreads=cfg::NUM_THREADS_START; nthreads<=cfg::NUM_THREADS_END;
        ++nthreads) {
        tbb::task_scheduler_init init(nthreads);
        tm_init = util::gettime();
        initialize_buf(points);
        tm_start = util::gettime();
        quickhull_buf(points, hull);
        tm_end = util::gettime();

        util::WriteResults(nthreads, util::time_diff(tm_init, tm_start),
                           util::time_diff(tm_start, tm_end));
    }

    return 0;
}


You'll have to look at convex_hull.h for the definitions in the util namespace. There, you'll see (in the ParseInputArgs() function) that you can enter four parameters at the command line to configure your run:

    • NP - number of points (default = 5000000)

    • SNT - starting number of threads (default = 1)

    • ENT - ending number of threads (default = 8)

    • -v - turns verbose ON (default = OFF)



So, the command line parameters are parsed, and a loop is launched over the specified (or the default) thread count range. The util::gettime() function (see convex_hull.h) uses tbb::tick_count::now() to time the processing.

Data set initialization

initialize_buf() applies a parallel_for over FillRNDPointsVector_buf() to initialize the data set upon which the convex hull analysis will be performed. Here's initialize_buf():

void initialize_buf(pointVec_t &points) {
    points.clear();

    tbb::parallel_for(range_t(0, cfg::MAXPOINTS, FillRNDPointsVector_buf::grainSize),
                      FillRNDPointsVector_buf(points));
}


This is about as basic an example of use of the parallel_for construct as you're going to find. Interestingly, in this case the developer selected a specific grain size, namely 25000 (see the cfg namespace in convex_hull.h). Perhaps this example was developed prior to the introduction of automated grain size selection into Threading Building Blocks. My experimentation with grain size specification (see "TBB Parallel_for Grain Size Experiments") suggests that using the automated method usually produces an efficient result.

The FillRNDPointsVector_buf class shows us how the data set is initialized:

class FillRNDPointsVector_buf {
    pointVec_t &points;
    mutable unsigned int rseed;
public:
    static const size_t grainSize = cfg::GENERATE_GS;

    FillRNDPointsVector_buf(pointVec_t& _points)
        : points(_points), rseed(1) {}

    FillRNDPointsVector_buf(const FillRNDPointsVector_buf& other)
        : points(other.points), rseed(other.rseed+1) {}

    void operator()(const range_t& range) const {
        const size_t i_end = range.end();
        size_t count = 0, j = 0;
        point_t tmp_vec[grainSize];
        for(size_t i=range.begin(); i!=i_end; ++i) {
            tmp_vec[j++] = util::GenerateRNDPoint(count, rseed);
        }
        appendVector(tmp_vec, j, points);
    }
};


The util::GenerateRNDPoint() function performs the actual generation of each data point. You'll find this in convex_hull.h:

template <typename T>
point<T> GenerateRNDPoint(size_t& count, unsigned int& rseed) {
    /* generates random points on 2D plane so that the cluster
       is somewhat cirle shaped */
    const size_t maxsize=500;
    T x = random(rseed)*2.0/(double)RAND_MAX - 1;
    T y = random(rseed)*2.0/(double)RAND_MAX - 1;
    T r = (x*x + y*y);
    if(r>1) {
        count++;
        if(count>10) {
            if (random(rseed)/(double)RAND_MAX > 0.5)
                x /= r;
            if (random(rseed)/(double)RAND_MAX > 0.5)
                y /= r;
            count = 0;
        }
        else {
            x /= r;
            y /= r;
        }
    }

    x = (x+1)*0.5*maxsize;
    y = (y+1)*0.5*maxsize;

    return point<T>(x,y);
}


The comment tells us that the function "generates random points on 2D plane so that the cluster is somewhat [circle] shaped". A quick perusal of this code shows that it initially generates points that would uniformly fill a square region (x and y are independently drawn using the same random number function), centered on the origin. However, points that land farther from the origin (those toward the corners of the square, outside the unit circle) are usually "pulled back" toward the origin in the if(r>1) conditional, with an occasional outlier left in place by the count>10 branch.
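To make the pull-back concrete before we get to the graphs, here is a self-contained plain-C++ rendition of the same math. The rand_r() call is my stand-in for the example's util::random() (an assumption on my part); the constants and the pull-back rule mirror the quoted code.

```cpp
#include <cstdlib>
#include <utility>

// Plain-C++ rendition of the GenerateRNDPoint math, for illustration.
// maxsize, the (x,y) transforms, and the pull-back rule mirror the
// quoted code; rand_r() stands in for util::random().
std::pair<double, double> generate_point(size_t& count, unsigned int& rseed) {
    const double maxsize = 500;
    double x = rand_r(&rseed) * 2.0 / RAND_MAX - 1;  // uniform in [-1, 1]
    double y = rand_r(&rseed) * 2.0 / RAND_MAX - 1;
    double r = x * x + y * y;  // squared distance from the origin
    if (r > 1) {               // outside the unit circle: usually pull back
        if (++count > 10) {    // ...but every so often leave an outlier
            if (rand_r(&rseed) / (double)RAND_MAX > 0.5) x /= r;
            if (rand_r(&rseed) / (double)RAND_MAX > 0.5) y /= r;
            count = 0;
        } else {
            x /= r;
            y /= r;
        }
    }
    // map [-1, 1] x [-1, 1] onto [0, maxsize] x [0, maxsize]
    return std::make_pair((x + 1) * 0.5 * maxsize, (y + 1) * 0.5 * maxsize);
}
```

Note that whether a point is pulled back or left as an outlier, both coordinates stay within [-1, 1] before the final scaling (dividing by r > 1 only shrinks a coordinate), so every generated point lands inside the 500-by-500 square.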

Up next...

This is where I'll stop for today. For one thing, this post is already long. But, this is also a good stopping point from the logic point of view. We've looked at a lot of code. But what does it all mean? To help us visualize the problem, I'm going to do some testing and generate some graphs depicting the kind of point distributions that are generated by this data initialization process. I'll work on that, and have an image or two for you when I write my next post.

Kevin Farnham
O'Reilly Media
TBB Open Source Community
