I have implemented a multithreaded option in my code. It runs and gives the right answer, but it is many times slower than the single-threaded version. How much overhead is incurred in creating and exiting threads? That seems to take quite a while. Are there any references I can use for guidance?
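One way to put a number on the creation and exit cost is to time it directly. The sketch below is only an illustration and assumes C++11 `std::thread`, which may not match the threading API the code actually uses. The helper name `average_spawn_join_us` is hypothetical. It starts a thread with an empty body, joins it, and reports the average wall-clock cost per thread in microseconds.

```cpp
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical helper (not from any library): average wall-clock cost, in
// microseconds, of creating one std::thread that does no work and joining it.
double average_spawn_join_us(std::size_t iterations) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        std::thread t([] {});  // empty body, so only create/exit cost is timed
        t.join();              // wait for the thread to finish before the next one
    }
    const std::chrono::duration<double, std::micro> elapsed = clock::now() - start;
    // Guard against division by zero when no iterations were requested.
    return iterations ? elapsed.count() / static_cast<double>(iterations) : 0.0;
}
```

Calling `average_spawn_join_us(1000)` from `main` and comparing the result with the time one thread spends on its share of the work gives a rough sense of whether creation cost could explain the slowdown. If each thread's work is shorter than that figure, the overhead will likely dominate. With GCC on Linux the program may need to be built with `-pthread`.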