Hello there,

I'm working on a medical simulator where we have located a bottleneck that should be fairly easy to fix since it's a straight-forward recursive routine.

The located bottleneck is at line 9 in the code section below : if (!n1.m_bound.overlap(n2.m_bound))

Naturally I've already tried a stack based while loop instead of recursion but that actually ended up slowing down the execution so I went back to my original recursive implementation which in essence is the code I've pasted below.

The top-level pseudo-code calls to the collision routine are basically:

collide(object_a_root_index, object_a_root_nodes, object_b_root_index, object_b_root_nodes, resulting_ab_overlaps);

I have a couple simple questions before I begin to parallelize this code.

1) Would you suggest me use the fibonacci example as a starting point to parallelize this code or do you have more suitable example(s) for this problem?

2) In the fibonacci example you mention that tuning the cutOff parameter "requires some experimentation". I assume that you basically tune this parameter by hand and then reprofile the code until it's optimal for a specific target hardware, right?

3) I'm going to assume that you'll recommend using tbb::concurrent_vector instead of using the concurrency::concurrent_vector. Besides portability issues, what are the major differences between your implementation and the one provided in Parallel Patterns Library from Micrsoft?

void collide(const int i1, const std::vector<BVHNode> &nodes1, const int i2, const std::vector<BVHNode> &nodes2, std::vector<std::pair<int,int>> &result) { const BVHNode &n1 = nodes1[i1]; const BVHNode &n2 = nodes2[i2]; //++num_overlap_tests; if (!n1.m_bound.overlap(n2.m_bound)) return; if (n1.is_leaf_node()) { if (n2.is_leaf_node()) result.emplace_back(n1.m_primitive_idx, n2.m_primitive_idx); else { if (n2.m_left_idx >= 0) collide(i1, nodes1, n2.m_left_idx, nodes2, result); if (n2.m_right_idx >= 0) collide(i1, nodes1, n2.m_right_idx, nodes2, result); } } else { if (n1.m_left_idx >= 0) collide(n1.m_left_idx, nodes1, i2, nodes2, result); if (n1.m_right_idx >= 0) collide(n1.m_right_idx, nodes1, i2, nodes2, result); } }

Thanks in advance,

David Löfstrand