Multi-core example with OpenMP slower than single core?

Tony_the_D
October 13, 2008 2:06 PM PDT
#4 Reply to #2
Quoting - tim18

In the last example posted in this thread, I can't imagine why parallel sections would be used rather than parallel do, nor why the inner loop would be the one designated for OpenMP parallelism. If threaded parallelism is required without any thought given to optimization, /Qparallel would be preferable, even though it is still not often effective.

As to the minimum problem size for effective OpenMP parallelism, I have an example which achieves excellent threaded scaling on a Core 2 Duo even though the non-threaded version takes only 1 millisecond. Of course, this is an ideal case: the cache sharing is effective, and the threads persist from a previous parallel region. The Intel OpenMP run-time does show reduced overhead compared with the Microsoft and GNU libraries.

The basic point, that OpenMP parallelism will not have an advantage for a simple inner loop of length 1000, does apply to the posted case.

Thanks for the feedback.  I am new to the world of parallelization (obviously), so I will look at parallel do as well.
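
For anyone following along, here is a minimal sketch of what I understand the suggestion to be: put the work-sharing directive on the outer loop, so the threads are forked once for the whole nest and each one gets whole rows, instead of paying the OpenMP overhead on every pass through a short inner loop. It is written in C, where "parallel for" is the counterpart of Fortran's "parallel do". The function name, the array layout, and the workload are all hypothetical and are not taken from the code posted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (illustration only): scale every element of a
   rows x cols matrix stored row by row, and record the sum of each row. */
void scale_and_sum_rows(double *a, double *row_sum,
                        int rows, int cols, double factor)
{
    assert(a != NULL && row_sum != NULL && rows >= 0 && cols >= 0);

    int i;
    /* The directive sits on the OUTER loop: one fork/join for the whole
       nest, and each thread works through complete rows on its own. */
    #pragma omp parallel for
    for (i = 0; i < rows; ++i) {
        double s = 0.0;                    /* declared in the loop body, so private per thread */
        for (int j = 0; j < cols; ++j) {   /* short inner loop stays serial */
            size_t k = (size_t)i * (size_t)cols + (size_t)j;
            a[k] *= factor;
            s += a[k];
        }
        row_sum[i] = s;                    /* each iteration writes its own element */
    }
}
```

Built without an OpenMP flag, the pragma is ignored and the loop runs serially, which makes it easy to time both versions; with gcc the flag is -fopenmp, and with the Intel compiler on Windows it is /Qopenmp. Whether the threaded build actually wins still depends on how much work each row carries, as tim18 points out.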


