The auto-parallelization feature of the Intel C++ Compiler automatically translates serial portions of the input program into semantically equivalent multithreaded code. Automatic parallelization determines the loops that are good work sharing candidates, performs the dataflow analysis to verify correct parallel execution, and partitions the data for threaded code generation as is needed in programming with OpenMP directives. The OpenMP and Auto-parallelization applications provide the performance gains from shared memory on multiprocessor systems, IA-32 and Intel 64.
The following table lists the options that enable Auto-parallelization:
This option is useful for loops whose computation work volume cannot be determined at compile-time. The threshold is usually relevant when the loop trip count is unknown at compile-time.
The compiler applies a heuristic that tries to balance the overhead of creating multiple threads versus the amount of work available to be shared amongst the threads.
The n is an integer whose value is the threshold for the auto-parallelization of loops. Possible values are 0 through 100. If n is 0, loops get auto-parallelized always, regardless of computation work volume. If n is 100, loops get auto-parallelized when performance gains are predicted based on the compiler analysis data. Loops get auto-parallelized only if profitable parallel execution is almost certain. The intermediate 1 to 99 values represent the percentage probability for profitable speed-up. For example, n=50 directs the compiler to parallelize only if there is a 50% probability of the code speeding up if executed in parallel.
Also, to be "100%" sure that a loop will benefit from parallelization, the compiler needs to know the iteration count at compile time. For a "99%" or lower threshold, knowing the iteration count at compile time is not a requirement.
This leads to a big difference in the number of loops parallelized at 99% compared to 100%. For many apps, 99% is a better setting, but for some apps with a lot of short loops, 99% will slow them down.
The following example, int_sin.c, does not auto parallelize when we use /Qpar-threshold:100 using command line below :
C: >icl -c /Qparallel /Qpar-report3 /Qpar-threshold:100 int_sin.c
Para obter informações mais completas sobre otimizações do compilador, consulte nosso aviso de otimização.