Multi-core example with OpenMP slower than single core?

Tony_the_D
October 13, 2008 2:06 PM PDT
#4 Reply to #2
Quoting - tim18

In the last example posted in this thread, I can't see why parallel sections would be used rather than parallel do, nor why the inner loop would be the one designated for OpenMP parallelization. If threaded parallelism is required without any thought given to optimization, /Qparallel would be preferable, even though it is still not often effective.

As to the minimum problem size for effective OpenMP parallelism, I have an example which achieves excellent threaded scaling on a Core 2 Duo when the non-threaded version takes only 1 millisecond. Of course, this is an ideal case: the cache sharing is effective, as is the reuse of persistent threads left over from a previous parallel region. The Intel OpenMP run-time does show reduced overhead compared with the Microsoft and GNU libraries.

The basic point, that OpenMP parallelism will not have an advantage for a simple inner loop of length 1000, does apply to the posted case.

Thanks for the feedback.  I am new to the world of parallelization (obviously), so I will look at parallel do as well.
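
For reference, here is a minimal sketch of the outer-loop approach suggested in the quoted post. It is written in C, where the counterpart of Fortran's `parallel do` is `parallel for`. The function name `scale_rows`, its parameters, and the row-major array layout are assumptions made for illustration; this is not the code posted earlier in the thread.

```c
#include <stddef.h>

/* Hypothetical kernel: multiply every element of row i of a row-major
   n_rows x n_cols matrix by factor[i]. */
void scale_rows(double *a, const double *factor, int n_rows, int n_cols)
{
    /* Thread the OUTER loop: the thread team is started once and each
       thread receives whole rows, so the fork/join cost is paid once
       rather than once per row. */
    #pragma omp parallel for
    for (int i = 0; i < n_rows; ++i) {
        /* The short inner loop stays serial; it is left to the compiler
           to optimize or vectorize. */
        for (int j = 0; j < n_cols; ++j) {
            a[(size_t)i * n_cols + j] *= factor[i];
        }
    }
}
```

Built without an OpenMP flag, the pragma is ignored and the function runs serially with the same result, which makes it easy to compare timings with and without threading. Whether threading pays off still depends on the total amount of work, as noted in the quoted post.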


