This application note helps users of WRF versions 2.2 and 2.2.1 build and run the model with the Intel® Fortran Compiler, versions 10.1 and 11.0. These instructions have been tested on Intel® Core™2 Duo processors and Intel® Itanium® 2 processors running Linux and on Intel® Core™2 Duo processors running Mac OS* X version 10.5.
For WRF 3.0 and later, please review the article Building WRF and WPS with the Intel® Compilers on Linux.
The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. Parallel implementations of WRF include support for OpenMP* and for MPI*. For further information, see http://wrf-model.org/index.php
Obtaining the Source Code
WRF is in the public domain, and the source code may be obtained from the WRF project at the URL above. The version with the ARW solver, discussed here, may be downloaded, along with test data, from http://www.mmm.ucar.edu/wrf/users/downloads.html
- Intel® Fortran Compiler for Linux (version 10.1 or 11.0)
- Intel® C++ Compiler for Linux (version 10.1 or 11.0)
- An MPI library, such as Intel MPI, if intending to build a distributed memory version.
- See the WRF website for a full list of prerequisite libraries.
Configuration and Set-up Information
- Set up the Intel® compiler environment, e.g., by the bash shell command source ifortvars.sh from the compiler bin directory. Also source iccvars.sh if using the Intel C++ compiler. For the 11.0 compiler only, these scripts require an argument intel64 or ia32.
- If using Intel® MPI, set up the environment with source mpivars.sh from the MPI bin directory (bin64 directory for Intel 64).
- Untar the WRF download and configure and build WRF according to the README file:
- Run ./configure for WRF and select an option that includes ifort.
- Modify the configure.wrf file as necessary. We recommend:
- Replace " -mp" by " -fp-model precise"
- For Intel® Itanium®-based processors running Linux, set FCOPTIM = -O3 -fno-alias -ip
- For Intel® 64 or IA-32 processors running Linux or Mac OS X, set FCOPTIM = -O3 -xT -fno-alias -ip for Intel® Core™2 Duo processors, or FCOPTIM = -O3 -xP -fno-alias -ip for any Intel® processor with at least SSE3 support. For the 11.0 compiler, -xssse3 is equivalent to -xT and -xsse3 is equivalent to -xP.
- For Intel® 64 or IA-32 processors running Linux, set FCOPTIM = -O3 -xW -fno-alias -ip for any processor with at least SSE2 support. For the 11.0 compiler, -msse2 is equivalent to -xW and is the default setting for Intel® 64 or IA-32 processors running Linux.
- If ARCHFLAGS contains the definition -DIFORT_KLUDGE, remove it.
- Ensure that the base options include -convert big_endian and -align all
- Verify NETCDFPATH
- Verify path for MPI if used.
- Make any additional changes indicated in the "Known Issues" section.
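Two of the edits in the list above (replacing " -mp" and removing -DIFORT_KLUDGE) are mechanical and can be scripted. The following is a minimal sketch, assuming a POSIX shell with sed and options separated by single spaces; patch_configure_wrf is a hypothetical helper name, not part of WRF. The FCOPTIM setting depends on the target processor and is left for manual editing.

```shell
# Hypothetical helper (not part of WRF): apply two of the recommended
# edits to a generated configure.wrf, keeping a backup as <file>.orig.
patch_configure_wrf() {
    pcw_cfg="${1:-configure.wrf}"
    cp "$pcw_cfg" "$pcw_cfg.orig" || return 1
    # 1. Replace " -mp" by " -fp-model precise" (mid-line and end-of-line).
    # 2. Remove the -DIFORT_KLUDGE definition wherever it appears.
    sed -e 's/ -mp / -fp-model precise /g' \
        -e 's/ -mp$/ -fp-model precise/' \
        -e 's/ *-DIFORT_KLUDGE//g' \
        "$pcw_cfg.orig" > "$pcw_cfg"
}

# Usage, from the WRF top-level directory after running ./configure:
#   patch_configure_wrf configure.wrf
#   diff configure.wrf.orig configure.wrf   # review the changes by hand
```

Comparing the patched file against the backup before compiling is advisable, since the layout of configure.wrf may differ between WRF versions and configure options.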
With these options, it should be possible to build all source files with full optimization. However, if desired, certain files, such as module_dm, may be built with FCBASEOPTS and OMP but without FCOPTIM, in order to reduce compilation time and memory requirement.
The -O3 option is available for both Intel® and non-Intel microprocessors but it may result in more optimizations for Intel microprocessors than for non-Intel microprocessors. For more information on processor-specific optimizations, see Intel® compiler options for SSE generation and processor-specific optimizations.
Choose one of the WRF test examples, downloading any required data, and build the test, preserving the output, for example:
./compile em_real > build.log 2>&1
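Because the build output is preserved in build.log, it can be checked afterwards. The sketch below scans the log for the compiler messages quoted in the "Known Issues" section and confirms that the expected executables exist. check_build is a hypothetical helper, and the executable paths in the usage comment (main/real.exe and main/wrf.exe) are an assumption about where the build places them.

```shell
# Hypothetical helper (not part of WRF): report known compiler messages
# in a build log and verify that each listed executable is non-empty.
check_build() {
    cb_log="$1"
    shift
    cb_status=0
    # Print matching log lines with line numbers; no match is not an error.
    grep -n -e 'Optimization suppressed' \
            -e 'Fatal compilation error' \
            -e 'internal threshold was exceeded' "$cb_log" || true
    for cb_exe in "$@"; do
        if [ ! -s "$cb_exe" ]; then
            echo "missing or empty: $cb_exe" >&2
            cb_status=1
        fi
    done
    return "$cb_status"
}

# Usage, from the WRF top-level directory (paths are assumptions):
#   check_build build.log main/real.exe main/wrf.exe
```

A non-zero return status means at least one executable was not produced; any lines printed point to the relevant entry under "Known Issues and Limitations".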
The following example runs the real data test, real.exe. The "real" data test requires data downloaded from the WRF web site; the "ideal" tests do not. Untar the real data files and run the initialization code to generate WRF input files. Be sure to increase the shell stack limit before running the main simulation (the ulimit command shown is for the bash shell).
cd test/em_real
tar -xzvf jan00_wps.tar.gz
./real.exe
ulimit -s unlimited
Run the main simulation (without Intel® MPI):
./wrf.exe
Run the main simulation (with Intel® MPI):
mpiexec -n <number of procs> ./wrf.exe
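The two ways of launching wrf.exe can be wrapped in one small function for use in job scripts. This is only a sketch: wrf_run_command is a hypothetical name, and it simply prints the command line described above for a given process count so that it can be inspected before being run.

```shell
# Hypothetical helper: print the launch command for wrf.exe.
# With a process count greater than 1, use mpiexec; otherwise run directly.
wrf_run_command() {
    wrc_nprocs="${1:-1}"
    if [ "$wrc_nprocs" -gt 1 ]; then
        echo "mpiexec -n $wrc_nprocs ./wrf.exe"
    else
        echo "./wrf.exe"
    fi
}

# Usage, from test/em_real after real.exe has generated the input files:
#   ulimit -s unlimited
#   eval "$(wrf_run_command 4)"
```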
See the README_TEST_CASES file for the ideal test cases. The utility <install dir>/external/io_netcdf/diffwrf may be used to compare an output file, such as <install dir>/test/em_real/wrfout_d01_2000-01-24_12:00:00, to a reference version. See the WRF website for further details.
Known Issues and Limitations
On IA-64: "fortcom: Warning: Optimization suppressed due to excessive resource requirements"
For certain WRF configurations, typically involving RSL, the compiler may scale back optimizations to limit the memory requirement and compile time. If you are building WRF on a system with plenty of memory, say 8 GB, you may use the option -override-limits to ask the compiler to continue the compilation without reducing the optimization level. When building with OpenMP, the file solve_em.f90 should be compiled with -override-limits, whatever the optimization level.
On Intel 64: "Fatal compilation error: Out of memory asking for ……."
The compiler for Intel 64 is a 32-bit executable and can access a maximum of 4 GB of memory. For certain WRF configurations, typically involving RSL, the compiler may exhaust the available memory for one or two files when compiled with maximum optimization. This may occur for additional files if less than 4 GB total (physical + virtual) memory is available, or on IA-32. In version 10.1 of the compiler only, the internal switch -switch fe_use_rtl_copy_arg_inout may be used to reduce the memory requirement. If warning messages such as "An internal threshold was exceeded" are seen, the additional option -mP2OPT_vec_xform_level=103 may be used to preserve optimization levels.
The version 11.0 compiler for Intel 64 is a native 64-bit executable and is not subject to the 4 GB limit on address space. The compiler may still exceed internal limits that are intended to limit memory usage and reduce compile time; this may or may not be accompanied by a warning message. These limits may be avoided by the switch -override-limits in the 11.0 and later compilers. It is strongly recommended to compile the file solve_em.f90 using the switch -override-limits. On a system with plenty of memory, -override-limits may be included in FCOPTIM.
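For systems with plenty of memory, adding -override-limits to FCOPTIM can be done with a one-line filter. The sketch below assumes a POSIX shell with awk and a configure.wrf in which the FCOPTIM assignment starts at the beginning of a line; add_override_limits is a hypothetical helper name. It leaves the file unchanged if the option is already present.

```shell
# Hypothetical helper (not part of WRF): append -override-limits to the
# FCOPTIM line of configure.wrf unless the option is already there.
add_override_limits() {
    aol_cfg="${1:-configure.wrf}"
    aol_tmp="$aol_cfg.tmp"
    awk '/^FCOPTIM[ \t]*=/ && !/-override-limits/ { $0 = $0 " -override-limits" }
         { print }' "$aol_cfg" > "$aol_tmp" && mv "$aol_tmp" "$aol_cfg"
}

# Usage, from the WRF top-level directory:
#   add_override_limits configure.wrf
#   grep '^FCOPTIM' configure.wrf
```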
Large memory use or very long compile times for module_configure.f90 may indicate that the definition -DIFORT_KLUDGE has not been removed from the configuration file as described above.