I have a question about parallel computing. I'm working on a machine with four 550 MHz Xeon processors running Debian Linux, and I compile with ifc using OpenMP directives. The program I built is essentially one big loop, so parallelization should help a lot, but the results are not that good. When I compare execution time with and without the OpenMP directives, I get a speed gain of 80%, which is far lower than the ideal +300% (by 80% gain I mean it takes 100 seconds instead of 180).
Why? Please comment on these possible causes (or suggest others!):
- huge thread management overhead
- when I don't use OpenMP directives, more than one processor is still used, so I'm not comparing 4 vs 1 but 4 vs ?
- memory bandwidth is saturated: the same RAM is shared by all four processors
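
To be clear about how I get the 80% and +300% figures, here is a small sketch of the arithmetic. The function name `speedup_stats` is just something I made up for illustration, and it assumes the 180-second run really used a single processor, which is exactly what I'm not sure about.

```python
def speedup_stats(t_serial, t_parallel, n_procs):
    """Return (speedup, gain in percent, parallel efficiency).

    Assumes t_serial was measured on one processor only;
    that is an assumption, not something I have verified.
    """
    speedup = t_serial / t_parallel       # how many times faster
    gain_pct = (speedup - 1.0) * 100.0    # the "+80%" style figure
    efficiency = speedup / n_procs        # 1.0 would be the ideal
    return speedup, gain_pct, efficiency


# My measurements: 180 s without OpenMP, 100 s with OpenMP, 4 processors.
speedup, gain_pct, efficiency = speedup_stats(180.0, 100.0, 4)
print(f"speedup: {speedup:.2f}x, gain: {gain_pct:.0f}%, efficiency: {efficiency:.0%}")

# The ideal case on 4 processors would be 180 s -> 45 s.
ideal_speedup, ideal_gain, _ = speedup_stats(180.0, 45.0, 4)
print(f"ideal: {ideal_speedup:.2f}x, gain: {ideal_gain:.0f}%")
```

So I measure a 1.8x speedup where 4x would be the ideal, meaning the four processors are used at about 45% efficiency.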
Thank you in advance.