I am running a large set of simulations using a Fortran application. My standard approach is to run multiple instances of the application so that several jobs run at a time, with a batch file in each CMD window so the entire set gets processed. The number of instances depends on the number of physical cores, but is usually 3 or 4.

I have noticed a large increase in run time when running 3 instances versus 1 (about 2.5 times slower), although it is still faster to run 3 at a time than 3 in series. I use the MKL, and the functions I am using are multithreaded by default. I have found that setting the environment variable MKL_NUM_THREADS=1 helps with the slowdown, but only gets it down to the 2.5 times slower figure.

I am familiar with multithreading but not with how to harness it well. For what it is worth, I am an engineer, not a programmer, so my application is very basic, as are my programming ninja skills.

So my first question is: if I want to keep my process the same (i.e., multiple instances), are there other ways to keep the sims from competing with each other? I played a little with the affinity mask, both as an environment variable and through the Task Manager, without any luck, but that is probably because it was the first time I had messed with it. Any help would be appreciated.
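
For reference, here is a minimal sketch of the kind of per-instance pinning I am asking about, not a confirmed fix. It only generates the CMD lines that would launch each instance on its own core with MKL limited to one thread. The executable name `sim.exe` and the job file names are hypothetical placeholders, and the assumption that each physical core owns two consecutive logical processor IDs is typical with Hyper-Threading but not guaranteed on every machine.

```python
# Sketch: build the Windows CMD lines for launching one simulation instance
# per physical core, with MKL restricted to a single thread.
# "sim.exe" and the job file names are hypothetical placeholders.


def affinity_mask(instance, logical_per_core=2):
    """Return the hex affinity mask (no 0x prefix) for one instance.

    Selects the first logical processor of physical core `instance`,
    ASSUMING each core owns `logical_per_core` consecutive logical IDs.
    """
    return format(1 << (instance * logical_per_core), "X")


def launch_commands(jobs, exe="sim.exe", logical_per_core=2):
    """Return CMD lines: one environment setting, then one start per job."""
    # Child processes started from the same CMD window inherit this variable.
    lines = ["set MKL_NUM_THREADS=1"]
    for i, job in enumerate(jobs):
        mask = affinity_mask(i, logical_per_core)
        # The empty "" is the window title argument that start expects first.
        lines.append(f'start "" /affinity {mask} {exe} {job}')
    return lines


if __name__ == "__main__":
    for line in launch_commands(["job1.inp", "job2.inp", "job3.inp"]):
        print(line)
```

With Hyper-Threading disabled, passing `logical_per_core=1` would give masks 1, 2, 4 instead of 1, 4, 10 (hex).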