This application note was created to help users of WRF version 2.2 and 2.2.1 make use of the Intel Fortran compiler, versions 10 and 11.
For WRF version 3, see the other references at /en-us/articles/building-wrf-and-wps-with-the-intel-compilers-on-linux-and-improving-performance-on-intel .
The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. Parallel implementations of WRF include support for OpenMP* and for MPI*. For further information, see http://wrf-model.org/index.php†
Obtaining the Source Code
WRF is in the public domain and source code may be obtained from the WRF project at the URL above. The version with the ARW solver, discussed here, and test data may be downloaded from http://www.mmm.ucar.edu/wrf/users/downloads.html†.
Obtaining the latest version of the Intel® Fortran and C++ Compilers
The latest versions of the Intel® Fortran and C++ compilers may be purchased, or evaluation copies requested, from http://www.intel.com/cd/software/products/asmo-na/eng/compilers/284132.htm.
Existing customers with current support can download the latest compilers directly from https://registrationcenter.intel.com/.
Prerequisites
- The Intel® Fortran Compiler for Linux;
- Either the Intel C++ Compiler for Linux or the GNU C++ compiler (gcc), version 3.2 or later.
- The network Common Data Form (netCDF)* library, obtainable from the Unidata site at http://www.unidata.ucar.edu/software/netcdf/† See the KB article Building netCDF with the Intel Compilers for instructions on building netCDF.
- An MPI library, such as MPICH* or Intel MPI, if intending to build a distributed memory version.
These instructions have been tested on Intel® Core™2 Duo processors and Intel® Itanium® 2 processors running Linux and on Intel® Core™2 Duo processors running Mac OS* X version 10.5.
These instructions apply to versions 10.1 and 11.0 of the Intel compilers. gcc, if used, must be version 3.2 or later.
Configuration and Set-up Information
1) Set up the Intel® compiler environment, e.g., by the bash shell command "source ifortvars.sh" from the compiler bin directory. Also "source iccvars.sh" if using the Intel C++ compiler. For the 11.0 compiler only, these scripts require an argument "intel64" or "ia32".
2) If using Intel® MPI, set up the environment with "source mpivars.sh" from the MPI bin directory (bin64 directory for Intel 64).
3) Configure (./configure) and build (make check) netCDF with default options.
(If necessary, see the build instructions at the Unidata netCDF web site.)
4) Set the environment variable NETCDF to point to the top level netCDF directory.
5) Untar the WRF download and configure and build WRF according to the README file:
6) Run ./configure for WRF and select an option that includes ifort.
7) Modify the configure.wrf file as necessary. We recommend:
a) Replace " -mp" by " -fp-model precise"
b) For Intel Itanium-based processors running Linux, set
FCOPTIM = -O3 -fno-alias -ip
c) For Intel® 64 or IA-32 processors running Linux or Mac OS X, set:
FCOPTIM = -O3 -xT -fno-alias -ip for Intel® Core™2 Duo processors
FCOPTIM = -O3 -xP -fno-alias -ip for any Intel® processor with at least SSE3 support
For the 11.0 compiler, -xssse3 is equivalent to -xT and -xsse3 is equivalent to -xP.
d) For Intel® 64 or IA-32 processors running Linux, set:
FCOPTIM = -O3 -xW -fno-alias -ip for any processor with at least SSE2 support. For the 11.0 compiler, -msse2 is equivalent to -xW and is the default setting for Intel® 64 or IA-32 processors running Linux.
e) If ARCHFLAGS contains the definition -DIFORT_KLUDGE, remove it.
f) Ensure that the base options include -convert big_endian and -align all
g) Verify NETCDFPATH
h) Verify path for MPI if used.
i) Make any additional changes indicated in the "known issues" section.
With these, it should be possible to build all source files with full optimization. However, if desired, certain files, such as module_dm, may be built with FCBASEOPTS and OMP but without FCOPTIM, in order to reduce compilation time and memory requirement.
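The edits in steps a), c) and e) can be scripted. The following is a minimal sketch, not a definitive recipe: the three-line configure.wrf it creates is a made-up sample (a real file generated by ./configure is much longer and may be laid out differently), and the temporary directory and the choice of -xT are illustrative assumptions.

```shell
#!/bin/sh
# Sketch only: the configure.wrf contents below are invented for illustration.
workdir=$(mktemp -d)
cfg="$workdir/configure.wrf"
cat > "$cfg" <<'EOF'
FCOPTIM         =       -O3
FCBASEOPTS      =       -w -ftz -align all -fno-alias -mp -convert big_endian
ARCHFLAGS       =       -DIFORT_KLUDGE -DNETCDF
EOF

# Step c) value for Intel Core 2 Duo; substitute the -xP or -xW line from the
# text for other processors.
fcoptim="-O3 -xT -fno-alias -ip"

# Write to a new file instead of using "sed -i", whose syntax differs between
# Linux and Mac OS X, and keep a copy of the original.
sed -e 's/ -mp / -fp-model precise /g' \
    -e 's/ -mp$/ -fp-model precise/' \
    -e 's/ *-DIFORT_KLUDGE//' \
    -e "s/^\(FCOPTIM[[:space:]]*=\).*/\1       $fcoptim/" \
    "$cfg" > "$cfg.new"
cp "$cfg" "$cfg.orig"
mv "$cfg.new" "$cfg"

# Show the result so the changes can be reviewed by eye.
cat "$cfg"
```

Against a real configure.wrf, point cfg at the generated file rather than creating the sample, and compare the result with the .orig copy (for example with diff) before building.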
The -O3 option is available for both Intel® and non-Intel microprocessors but it may result in more optimizations for Intel microprocessors than for non-Intel microprocessors. For more information on processor-specific optimizations, see Intel® compiler options for SSE generation and processor-specific optimizations.
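Steps 1, 2 and 4 of the set-up (compiler environment, MPI environment, NETCDF) could be collected in one small script. This is a sketch under stated assumptions: every path is a placeholder rather than a documented default, and the variable names FC_BIN, CC_BIN, MPI_BIN and ARCH_ARG and the helper source_if_present are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch only. All paths are placeholders; set them for your installation.
FC_BIN=${FC_BIN:-/path/to/intel/fortran/bin}
CC_BIN=${CC_BIN:-/path/to/intel/cpp/bin}
MPI_BIN=${MPI_BIN:-/path/to/intel/mpi/bin64}
# Per the text, only the 11.0 scripts require this argument ("intel64" or "ia32").
ARCH_ARG=${ARCH_ARG:-intel64}

# Source a setup script if it exists. Arguments after the script name remain
# as positional parameters, so the sourced script sees them as $1, $2, ...
source_if_present() {
    script=$1
    shift
    if [ -f "$script" ]; then
        . "$script"
    else
        echo "skipped (not found): $script" >&2
    fi
}

source_if_present "$FC_BIN/ifortvars.sh" "$ARCH_ARG"
source_if_present "$CC_BIN/iccvars.sh" "$ARCH_ARG"   # only with the Intel C++ compiler
source_if_present "$MPI_BIN/mpivars.sh"              # only with Intel MPI

# Step 4: point NETCDF at the top-level netCDF directory (placeholder path).
NETCDF=${NETCDF:-/path/to/netcdf}
export NETCDF
echo "NETCDF=$NETCDF"
```

The script has to be sourced (". ./setup_env.sh", where the file name is again hypothetical) rather than executed, so that the settings remain in the shell that later runs ./configure and ./compile.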
Source Code Changes
No source code changes should be necessary.
Choose one of the WRF test examples, downloading any required data, and build the test, preserving the output, e.g., ./compile em_real > build.log 2>&1
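Because the build output goes to build.log, a failed build is easy to miss. A quick check, sketched here with a hypothetical helper check_wrf_build, is to confirm that the executables exist. It assumes that an em_real build leaves real.exe and wrf.exe in the main directory under the WRF top-level directory; adjust the names for an ideal case.

```shell
#!/bin/sh
# Hypothetical helper: report whether the expected executables exist.
# Assumption: an em_real build places real.exe and wrf.exe in <WRF dir>/main.
check_wrf_build() {
    wrf_dir=$1
    missing=0
    for exe in real.exe wrf.exe; do
        if [ -x "$wrf_dir/main/$exe" ]; then
            echo "found   $exe"
        else
            echo "missing $exe"
            missing=1
        fi
    done
    # Exit status 0 only if every executable was found.
    return "$missing"
}
```

From the WRF top-level directory this might be used as: check_wrf_build . || grep -i error build.log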
Go to the directory for the chosen test:
The "real" data test requires data downloaded from the WRF web site, the "ideal" tests do not.
Untar the data files:
tar -xzvf jan00_wps.tar.gz
Run the initialization code to generate WRF input files:
./real.exe ( or ./ideal.exe )
Increase the shell stack limit:
ulimit -s unlimited (limit stacksize unlimited for C shell)
Run the main simulation:
./wrf.exe
or, to run under MPICH:
mpirun -n <number of procs> ./wrf.exe
or, to run under Intel® MPI:
mpdboot --file=<hostfile>
mpiexec -n <number of procs> ./wrf.exe
See the README_TEST_CASES file for the ideal test cases.
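The run steps above can be gathered into one shell function. This is a sketch, not part of WRF: the function name run_wrf_case and its arguments are hypothetical, it assumes the data files are already unpacked in the test directory, and it uses mpiexec for parallel runs (replace that line with the mpirun form for MPICH, and run mpdboot first for Intel® MPI as described above).

```shell
#!/bin/sh
# Hypothetical helper: run one WRF test case.
#   $1  test directory containing the executables and data
#   $2  number of MPI processes (default 1 = run ./wrf.exe directly)
#   $3  initialization program (default real.exe; use ideal.exe for ideal cases)
run_wrf_case() {
    case_dir=$1
    nprocs=${2:-1}
    init_exe=${3:-real.exe}
    (
        # The subshell keeps the directory change and the stack limit local.
        cd "$case_dir" || exit 1
        # Raise the stack limit; if "unlimited" is refused, fall back to the
        # hard limit, and only warn if that also fails.
        ulimit -s unlimited 2>/dev/null \
            || ulimit -s "$(ulimit -H -s)" 2>/dev/null \
            || echo "warning: could not raise the stack limit" >&2
        # Generate the WRF input files, then run the main simulation.
        "./$init_exe" || exit 1
        if [ "$nprocs" -gt 1 ]; then
            mpiexec -n "$nprocs" ./wrf.exe
        else
            ./wrf.exe
        fi
    )
}
```

For example, run_wrf_case test/em_real 4 would initialize with real.exe and then start wrf.exe on four processes.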
The utility <install dir>/external/io_netcdf/diffwrf may be used to compare an output file, such as
<install dir>/test/em_real/wrfout_d01_2000-01-24_12:00:00,
to a reference version. See the WRF website for further details.
Threading with OpenMP
Known Issues and Limitations
On IA-64: "fortcom: Warning: Optimization suppressed due to excessive resource requirements"
For certain WRF configurations, typically involving RSL, the compiler may scale back optimizations to limit the memory requirement and compile time. If you are building WRF on a system with plenty of memory, say 8 GB, you may use the option -override-limits to ask the compiler to continue the compilation without reducing the optimization level. When building with OpenMP, the file solve_em.f90 should be compiled with -override-limits, whatever the optimization level.
On Intel 64: "Fatal compilation error: Out of memory asking for ……."
The compiler for Intel 64 is a 32 bit executable and can access a maximum of 4 GB of memory. For certain WRF configurations, typically involving RSL, the compiler may exhaust the available memory for one or two files when compiled with maximum optimization. This may occur for additional files if less than 4 GB total (physical + virtual) memory is available, or on IA-32. In version 10.1 of the compiler only, the internal switch -switch fe_use_rtl_copy_arg_inout may be used to reduce the memory requirement. If warning messages such as "An internal threshold was exceeded" are seen, the additional option -mP2OPT_vec_xform_level=103 may be used to preserve optimization levels.
The version 11.0 compiler for Intel 64 is a native 64 bit executable and is not subject to the 4 GB limit on address space. The compiler may still exceed internal limits that are intended to limit memory usage and reduce compile time; this may or may not be accompanied by a warning message. These limits may be avoided by the switch -override-limits in the 11.0 and later compilers. It is strongly recommended to compile the file solve_em.f90 using the switch -override-limits. On a system with plenty of memory, -override-limits may be included in FCOPTIM.
Large memory use or very long compile times for module_configure.f90 may indicate that the definition -DIFORT_KLUDGE has not been removed from the configuration file as described above.
Please report any problems building WRF that are not described here to Intel Premier Support at https://premier.intel.com.
† This link will take you off of the Intel Web site. Intel does not control the content of the destination Web Site.