This article describes a method to compile and run a distributed memory coarray program using Intel® Parallel Studio XE Cluster Edition. An example using Linux* is presented.
For shared memory applications: Intel® Fortran Compiler XE 15.0 or newer.
For distributed memory applications: Intel® Parallel Studio XE 2015 Cluster Edition for Linux or newer.
To compile for distributed memory coarrays, use compiler option -coarray=distributed (Linux* OS) or /Qcoarray:distributed (Windows* OS). This requires an Intel® Cluster Toolkit license.
To compile for shared memory coarrays, use compiler option -coarray=shared (Linux* OS) or /Qcoarray:shared (Windows* OS). Compiling for shared memory coarrays does not require an Intel® Cluster Toolkit license.
Obtaining Example Source Code
The coarray example from the Composer XE 'coarray_samples' directory is available.
Intel® Parallel Studio XE Cluster Edition is required for compilation, and the Intel® MPI Library must be installed on the cluster nodes.
Configuration Set Up
The key to running a distributed memory coarray program with process pinning on specific nodes is to build with the compiler option -coarray-config-file=filename (Linux* OS) or /Qcoarray-config-file:filename (Windows* OS). This lets you take full advantage of Intel® MPI Library features in the coarray environment, in the same way that 'mpiexec -config filename' lets mpiexec take its commands from 'filename'.
The contents of the configuration file for this example:
-host host1 -env I_MPI_PIN_PROCESSOR_LIST 0,2,4 -n 3 <path to executable>/coarry_dist_host.x : -host host2 -env I_MPI_PIN_PROCESSOR_LIST 1,3,5 -n 3 <path to executable>/coarry_dist_host.x
This says to execute six coarray images of executable 'coarry_dist_host.x' on nodes host1 and host2, using processors 0,2,4 on host1, and processors 1,3,5 on host2. The I_MPI_PIN_PROCESSOR_LIST environment variable is used to achieve the process pinning on the indicated nodes.
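With more than a couple of hosts, typing this line by hand becomes error-prone. The following is a minimal sketch of a shell helper that assembles a line in the same form as the one shown above. The function name make_coarray_config, the "host:processor-list:count" argument format, and the path /path/to/coarry_dist_host.x are assumptions made for illustration; they are not part of the Intel tools.

```shell
#!/bin/sh
# Hypothetical helper (not an Intel tool): print one argument set per
# "host:processor-list:count" spec, joined by " : " as in the line above.
# Usage: make_coarray_config EXECUTABLE SPEC...
make_coarray_config() {
    exe=$1
    shift
    line=""
    for spec in "$@"; do
        host=${spec%%:*}      # text before the first colon
        rest=${spec#*:}       # everything after the first colon
        cpus=${rest%%:*}      # processor list, e.g. 0,2,4
        count=${rest#*:}      # number of images to start on this host
        args="-host $host -env I_MPI_PIN_PROCESSOR_LIST $cpus -n $count $exe"
        if [ -z "$line" ]; then
            line=$args
        else
            line="$line : $args"
        fi
    done
    printf '%s\n' "$line"
}

# Example: the executable path is a placeholder; substitute the real location.
make_coarray_config /path/to/coarry_dist_host.x host1:0,2,4:3 host2:1,3,5:3
```

Redirecting the output to coarray_config.txt would produce a file equivalent to the one used in this example, provided the placeholder path is replaced with the actual location of the executable.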
Source Code Changes
See Verifying Correctness
Building the Application
Compile for distributed coarrays, create one coarray image, and specify the coarray configuration file:
ifort -coarray=distributed -coarray-num-images=1 -coarray-config-file=coarray_config.txt coarry_dist_host.f90 -o coarry_dist_host.x
Running the Application
Simply specify the name of the executable:
> <path to executable>/coarry_dist_host.x
Hello from image 1 out of 6 total images, and running on host: host1
Hello from image 2 out of 6 total images, and running on host: host1
Hello from image 3 out of 6 total images, and running on host: host1
Hello from image 5 out of 6 total images, and running on host: host2
Hello from image 4 out of 6 total images, and running on host: host2
Hello from image 6 out of 6 total images, and running on host: host2
>
Verifying Correctness
Embed a call to hostnm(hostname) in your coarray program, then print 'hostname' to verify that the images run on the intended nodes.
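When many images are running, counting the lines per host by eye is tedious. The sketch below tallies images per host from the program's output. It assumes each line ends with "running on host: <name>", as in the sample output of this article; the function name count_images_per_host is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count images per host from the program's output.
# Assumes the host name is the last field on each "running on host:" line.
count_images_per_host() {
    awk '/running on host:/ { n[$NF]++ }
         END { for (h in n) print h, n[h] }' | sort
}

# Example with two lines taken from the sample output:
printf '%s\n' \
    'Hello from image 1 out of 6 total images, and running on host: host1' \
    'Hello from image 4 out of 6 total images, and running on host: host2' |
    count_images_per_host
```

Piping the real program's output into count_images_per_host should, for the configuration used in this article, report three images each for host1 and host2. Note that this checks only the node placement; it does not confirm which processors the images are pinned to.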
Building with a coarray configuration file lets you pin coarray images to specific nodes and to specific processors on those nodes. This can improve load balance across cluster nodes and makes it easy to restrict a run to a subset of nodes.
Known Issues or Limitations
Distributed memory coarrays only work with Intel® MPI; other implementations of MPI are not supported.