Parallel for with concurrent_hash_map and OpenMP

Parallel for with concurrent_hash_map and OpenMP

I'm using OpenMP with concurrent data structures from Intel TBB. localClustersTable is a concurrent_has_map.

    #pragma omp for
    for(auto& it = localClustersTable.begin(); it<localClustersTable.end(); it++)
    {
        auto& clusters = it->second;
        clusters->ComputeZones();
    }

But it doesn't support "<" operator for its iterator. I can not use "!=" instead of "<" because that won't be compatible with OpenMP. So I can't parallelize the loop using OpenMP because OpenMP needs random access to container. How can I make it work?

14 帖子 / 0 全新
最新文章
如需更全面地了解编译器优化,请参阅优化注意事项

I'm trying to get evey value of a concurrent_hash_map and do some work with it. Can I achieve this via TBB parallel_for construct?

THis may work:

#pragma omp parallel
{
  nThreads = omp_get_num_threads();
  iThread = omp_get_thread_num();
  if(nThreads == 1)
  {
    for(auto& it = localClustersTable.begin(); it<localClustersTable.end(); it++)
    {
      auto& clusters = it->second;
      clusters->ComputeZones();
    }
  }
  else
  {
    int i = 0; // each thread has local i
    auto& it = localClustersTable.begin();
    while(it<localClustersTable.end())
    {
      if((i % nThreads) == iThread)
      {
        auto& clusters = it->second;
        clusters->ComputeZones();
      }
      i++;
      it++;
    }
  }
}

Jim Dempsey

 

www.quickthreadprogramming.com

You have missed the point. As I have said before

But it doesn't support "<" operator for its iterator. I can not use "!=" instead of "<" because that won't be compatible with OpenMP.

You are using same "<" operator for iterator.

I'm not familiar with OpenMP, but that would probably only work with random-access iterators (either directly or indirectly through an index). A concurrent_hash_map::iterator can only navigate forward.

For parallel iteration, use tbb::parallel_for() with tbb::concurrent_hash_map::range().

Raf is correct.  From the OpenMP 4.0 spec:

Note – Random access iterators are required to support random access to elements in constant time. Other iterators are precluded by the restrictions since they can take linear time or offer limited functionality. It is therefore advisable to use tasks to parallelize those cases.

Yes, I'm aware of OpenMP needs random access iterators.

@Raf: Can you please post a sample code using tbb::parallel_for() with tbb::concurrent_hash_map::range().

I don't have example code for that, but it's one of the most basic things you can do in TBB.

Like OpenMP, is there a requirement for TBB to use random access iterator? 

For example, OpenMp doesn't allow the following loop.

for(const auto& pair: localClusterTable)
{

    //Do some work

}

What about TBB, does it allow this loop?

Both OpenMP and Intel TBB should require random access iterators for their parallel_for operations, because a divisible random access iterator is the canonical way to divide the work among a team of threads.  Both have a notion of how many threads are available and use that information to try to divide the work evenly between the available set of threads.  Partitioning the available work this way requires the ability to divide the work without touching every element of it (which can play havok with cache residencies).  

I found it quite surprising to notice only a few days ago that the standard binary-search algorithms, and even binary_search() itself, accept forward iterators, because how much effort do you have to spend in those cases to guarantee a logarithmic number of comparisons and a linear number of steps? That would be 2*N steps: you have to do a full walk first just to get the number of elements to be able to decide which is the middle one (the input is two iterators, not a collection that can tell you its size), and then cumulatively another full walk to find the target. To guarantee N steps all the iterators would have to be copied to a vector first, but that seems like overkill. Right? Unless implementations just ignore all that in the first place?

By the way, I don't understand the notation in "At most log(last-first) + 1 comparisons.": log() is supposed to be the natural logarithm, as opposed to log2() or log10(), but that makes little sense here, and if it's a shortcut for O(log N) then what is the +1 about?

Meanwhile, less is more: the compiler will tell you if the iterator is not random-access, so you'll avoid those N*log(N) iterator steps (1.5 traversals at each level in the tree).

Do tell me if I've made a mistake.

Presumably the iterator of a concurrent_hash_map has a relatively light-weight itr++ which involves no comparrisons. The code I presented should iterate fairly quickly. Now if you could fix the iterator to support itr+=nThreads then the while pick next loop would run somewhat faster.

Making this iterator support [] should be relatively straitforward. This combined with size() would permit omp to slice up the iteration space.

Jim Dempsey

www.quickthreadprogramming.com

Both TBB and OpenMP need random access iterator. But then, why TBB allows "!=" for iterator, but OpenMP doesn't?

Those are separate topics.  Intel TBB and OpenMP need random access to the block of values they are going to execute in parallel because of the need to partition the work among a thread team with a minimum of overhead, not touching all the intermediate values, yada yada yada.

The constraint in OpenMP to use relational operators for the loop condition is specific for OpenMP; loop-constructs in OpenMP ALWAYS operate on compact blocks of values. and it is safer to use a relational operator (<, <=, >, >=) because the address of the last element in the block (or first) is always at an address extreme.  Whereas Intel TBB also incorporates concepts from the C++ Standard Template Library, which has the broadened concept of iterators, which may have a variety of implementations, from Foward to Random-Access.  With some forms of iterators, relational operators such as those just enumerated will generate bad code: the interior elements of a particular list may not be within address ranges between the first element of the list and the last one.  Canonical form for STL iterators is to iterate as long as the iterator is not equal to the end item (the empty marker of the end of a half-open interval).  Therefore, Intel TBB can use  the != operator for testing its termination condition.

The real underlying issue here is the attempt to use OpenMP to parse STL-style iterators: they aren't really compatible with each other.

发表评论

登录添加评论。还不是成员?立即加入