This post will, I hope, be the first in a series of short essays on using various libraries for parallelizing computations. As the sample task I have chosen rendering the familiar Mandelbrot set. This time the computations will be implemented with OpenMP, while GLUT/OpenGL will be used to unify the work with different windowing subsystems.
Note. PVS-Studio is an add-in module for Visual Studio 2005/2008 (and soon 2010) that detects a wide range of errors in 64-bit and parallel OpenMP applications. PVS-Studio is a contemporary interactive static C/C++ code analyzer. By 'interactive' we mean, for instance, the ability to filter and suppress warnings without relaunching the analysis.
Particle systems are an ideal candidate for multi-threading in games. Most games have particle systems and their general nature of independent entities lends well to parallelism. However, a naïve approach won’t load balance well on modern architectures. There are two complementary approaches, task-based threading and SSE, which are ideally suited for particle systems and will obtain maximum performance from multi-core processors.
Though I wrote my previous TBB task scheduler blog just a few days after TBB 3.0 Update 4 had been released, I ignored that remarkable event and instead delved into events more than two years in the past. So today I'm going to redeem that slight and talk about a couple of small but quite useful improvements in the TBB scheduler behavior made in the aforementioned update.
Though Intel® Threading Building Blocks 3.0 Update 4, which introduces the concept of Community Preview features, has just been released, today's blog will be about something that happened quite a long time ago. One of the recent posts on the TBB forum drew my attention to how rapidly information becomes obsolete.
In White Rabbits – part 2, in the last chart, we ran test 10 to see what happens when we run the test 9 environment (other threads running as part of the application thread pool). Test 10 added the use of parallel_invoke to run the 2nd and 3rd loops as separate tasks. We saw some improvement in performance.
Some time ago I wrote in this blog about problems (see the note "OpenMP and exceptions") that occur when an exception crosses the boundary of a parallel section. I also mentioned that an exception can be thrown by the new operator, and that it must be caught and handled before it leaves the parallel section. The constructions required for this are rather inconvenient and complicated. Not long ago I was told that in this case the smartest solution is to use the form of the new operator that does not throw exceptions.
There is pleasant news for developers who want to use iterators and OpenMP together in their programs. Not that these technologies have been incompatible until recently, but it was impossible to use them in a way that let them complement each other. Iterators let you traverse container items elegantly, they are safer, and so on. The advantages of iterators have been described in many books, so we will not enumerate them here.
I came across an interesting thread at the RSDN forum where a specific error in using the rand() function inside OpenMP parallel sections is considered (RSDN forum thread (RU)). I collect various errors related to the OpenMP technology so that their diagnosis can later be implemented in the VivaMP static code analyzer.
My plan to go parallel this time was thwarted by concerns that I may still have left some serial performance on the table. So I’ll take one more look (OK, well, no more than three). Leading the contenders was Jim Dempsey’s suggestion that accumulating forces instead of accelerations would save some divides. His numbers did not show a dramatic difference but did suggest summing forces to be ever so slightly faster than accumulating accelerations.