To add parallelism to the relevant parts of your program:
- Replace annotations with parallel framework code.
- Build a parallel version of your program.
- Run the parallel program.
With your own program, before you add parallel framework code, you should complete developer/architect design and code reviews of the proposed parallel changes.
Add the Parallel Framework Code
In this step, you would normally use the editor to replace Intel Advisor annotations with parallel framework code. For your convenience, the sample already contains this code, so you only need to view the source files after you use make to build the executable.
In the source file nqueens_omp.f90 that is built into the 3_nqueens_omp executable file, OpenMP* code has been added. OpenMP is the high-level parallel framework supported by Intel Advisor for Fortran programs.
Build and Examine the Parallel Program
Use the nqueens Fortran sample to build and run the parallel program, which uses the OpenMP parallel framework.
In a terminal session, locate the Intel Advisor installation directory root on your system. The default installation location is /opt/intel/advisor_xe_2013.
Type source /opt/intel/advisor_xe_2013/advixe-vars.sh (or equivalent path) to set up your bash shell environment. With a different shell, source the advixe-vars.csh script.
If you need to set up your compiler environment, do so now.
- Verify that the ADVISOR_XE_2013_DIR environment variable is set - for example, type: env | grep ADVISOR_XE_2013_DIR. If it is not set, define it by using an export command: export ADVISOR_XE_2013_DIR=/opt/intel/advisor_xe_2013.
Change directory to the nqueens_fortran/ directory (created in a previous tutorial).
Type make 3_nqueens_omp to build an OpenMP parallel version of the nqueens Fortran sample application.
- You built the OpenMP version, so open the source file nqueens_omp.f90. View the OpenMP #include file and related OpenMP code, such as the !$OMP PARALLEL DO line in the solve subroutine. Also notice that the lock annotations have been replaced by an !$OMP ATOMIC directive for the nrOfSolutions = nrOfSolutions + 1 statement in the setQueen subroutine. For more information about OpenMP, see Intel Advisor help topics under Adding Parallelism to Your Program or locate the OpenMP documentation in your compiler documentation, such as the Intel® Composer XE documentation directory.
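The environment setup steps above can be collected into a small shell function. This is a minimal sketch, not part of the sample: the function name advisor_setup is hypothetical, and the default path is the tutorial's default installation location, which may differ on your system.

```shell
# Hypothetical helper: sources the Intel Advisor environment script and
# makes sure ADVISOR_XE_2013_DIR is defined, as the steps above describe.
# Argument 1 (optional): installation root; defaults to the tutorial's path.
advisor_setup() {
    advisor_root="${1:-/opt/intel/advisor_xe_2013}"
    if [ ! -f "$advisor_root/advixe-vars.sh" ]; then
        echo "advixe-vars.sh not found under $advisor_root" >&2
        return 1
    fi
    # Set up the current shell environment.
    . "$advisor_root/advixe-vars.sh"
    # Define the variable only if the script above did not already set it.
    : "${ADVISOR_XE_2013_DIR:=$advisor_root}"
    export ADVISOR_XE_2013_DIR
    echo "ADVISOR_XE_2013_DIR=$ADVISOR_XE_2013_DIR"
}
```

After advisor_setup succeeds, change to the nqueens_fortran/ directory and type make 3_nqueens_omp as described above.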
With your own program, while you add parallel framework code, view the Annotation Report window to help you locate the remaining Intel Advisor annotations that need to be replaced with parallel framework code. For help completing this step for your own program:
- Open the Advisor XE Workflow tab.
- Click the button below 5. Add Parallel Framework.
- View the instructions. Click the links to display topics in Intel Advisor help.
Run the Parallel Program
In the same terminal session, change directory (cd) to the nqueens_fortran directory that you created when you extracted the nqueens_Fortran.tgz file.
Run the sample parallel application that you built previously using one of the parallel frameworks. For example, type: ./3_nqueens_omp.
Check for output similar to the following:
    -bash-4.1$ ./3_nqueens_omp
    Usage: 3_nqueens_omp[_debug] boardSize
    Using default size of 14
    Starting OpenMP solver for size 14 with 4 thread(s)
    Number of solutions: 365596
    Calculations took 3477ms.
    Correct Result!
The displayed execution time to run the parallel program on a 4-core system is significantly less than the time required to run the serial version of the program, which was about 7160 ms. The execution times will be different on your system.
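To put a number on "significantly less", the two times can be turned into a speedup and a per-core efficiency. The following is a small sketch; the function name speedup is hypothetical, and the inputs are the serial time, the parallel time, and the core count quoted above.

```shell
# Hypothetical helper: speedup = serial / parallel,
# efficiency = speedup / cores (shown as a percentage).
# Arguments: serial_ms parallel_ms cores
speedup() {
    awk -v s="$1" -v p="$2" -v n="$3" 'BEGIN {
        sp = s / p
        printf "speedup %.2fx, efficiency %.0f%%\n", sp, 100 * sp / n
    }'
}
```

With the values from this run, speedup 7160 3477 4 prints "speedup 2.06x, efficiency 51%". Substitute the times measured on your own system.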
The difference between the serial and parallel execution time depends on multiple factors:
- The number of cores available on your system.
- How much of the original program's execution time was placed within parallel sites, and the characteristics of the tasks. Intel Advisor Suitability and Correctness tools use annotations to predict your serial program's parallel behavior, which lets you experiment with your sites and tasks before you add any parallel code.
- Parallel overhead, type of locks, thread characteristics, and other factors. See Next Steps for the Parallel Program.
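One common way to reason about the first two factors is Amdahl's law, which this tutorial does not mention by name: if a fraction p of the serial run time is inside parallel sites and n cores are available, the best possible speedup is 1 / ((1 - p) + p / n). The sketch below evaluates that formula; the function name amdahl_speedup is hypothetical, and the result ignores the overhead and lock factors in the last item, so treat it as an upper bound.

```shell
# Hypothetical helper: upper bound on speedup from Amdahl's law.
# Arguments: p (parallel fraction between 0 and 1), n (number of cores)
amdahl_speedup() {
    awk -v p="$1" -v n="$2" 'BEGIN {
        printf "%.2f\n", 1 / ((1 - p) + p / n)
    }'
}
```

For example, amdahl_speedup 0.9 4 prints 3.08: even if an assumed 90% of the run time is inside parallel sites, four cores give at most about a 3x speedup.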