I have code with several calls to omp_set_lock() and omp_unset_lock(). When I compile it with icc, I get a considerable slowdown: with more than one thread, the running time stays at about double the single-thread time, regardless of the thread count. When I compile with gcc, I get a good (but not great) speedup up to 16 threads. Comparing running times at 16 threads, the icc build is about 52 times slower than the gcc build!
There seems to be some issue with how icc handles locks. Could someone confirm whether this is a known issue and whether there is a known workaround?
I am using gcc version 4.1.2 20080704 and icc version 11.1. The system has four E7-4850 CPUs.
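To make the question concrete, here is a minimal sketch of the kind of lock usage described above. It is an illustration only, not the actual code: the function name `locked_increment`, the single shared counter, and the iteration count are all assumptions. Timing something like this under both compilers may help show whether the difference comes from the lock routines themselves.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback stubs (assumption: only needed when the compiler is
 * invoked without OpenMP enabled, so the sketch still builds). */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
static void omp_set_lock(omp_lock_t *l)     { (void)l; }
static void omp_unset_lock(omp_lock_t *l)   { (void)l; }
static double omp_get_wtime(void)           { return 0.0; }
#endif

/* Hypothetical stand-in for the real workload: total_iters lock-protected
 * increments of one shared counter, split across up to nthreads threads.
 * Returns the final counter; stores elapsed wall time in *seconds
 * (reported as 0.0 in the serial fallback). */
long locked_increment(int nthreads, int total_iters, double *seconds)
{
    omp_lock_t lock;
    long counter = 0;
    double t0, t1;
    int i;

    (void)nthreads; /* unused when OpenMP is disabled */

    omp_init_lock(&lock);
    t0 = omp_get_wtime();

    #pragma omp parallel for num_threads(nthreads)
    for (i = 0; i < total_iters; i++) {
        omp_set_lock(&lock);   /* enter the critical region */
        counter++;             /* very short protected section */
        omp_unset_lock(&lock); /* leave the critical region */
    }

    t1 = omp_get_wtime();
    omp_destroy_lock(&lock);

    if (seconds != NULL)
        *seconds = t1 - t0;
    return counter;
}
```

Calling `locked_increment(1, 10000000, &t)` and `locked_increment(16, 10000000, &t)` from a small `main` and printing `t` gives one number per thread count to compare. The build flags would be `gcc -fopenmp` and, as far as I know for icc 11.1, `icc -openmp`. Because the protected section is so short, this sketch is dominated by lock contention, so it may exaggerate the effect compared with code that does real work between lock calls.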