ILP64 model: using MPI_IN_PLACE in MPI_REDUCE seems to yield wrong results

Hi,

I am using the ifort compiler v13.0.1 (build 20121010) together with Intel MPI v4.1.0.024 on an x86_64 Linux cluster. With 64-bit integers as the default (ILP64 model), my small Fortran program produces wrong results whenever I use MPI_IN_PLACE in MPI_REDUCE calls (both for integer and real(8) data).

My code is as follows:

	program test

	include "mpif.h"
	! use mpi

	integer :: iraboof
	integer :: mytid, numnod, ierr
	real(8) :: rraboof

	mytid = 0

	! initialize MPI environment
	call mpi_init(ierr)
	call mpi_comm_rank(mpi_comm_world, mytid, ierr)
	call mpi_comm_size(mpi_comm_world, numnod, ierr)

	iraboof = 1
	if (mytid == 0) then
	   call mpi_reduce(MPI_IN_PLACE, iraboof, 1, mpi_integer, mpi_sum, 0, mpi_comm_world, ierr)
	else
	   call mpi_reduce(iraboof, 0, 1, mpi_integer, mpi_sum, 0, mpi_comm_world, ierr)
	end if

	if (mytid == 0) then
	   print *, 'raboof mpi reduce', iraboof, numnod
	end if

	rraboof = 1.0d0
	if (mytid == 0) then
	   call mpi_reduce(MPI_IN_PLACE, rraboof, 1, mpi_real8, mpi_sum, 0, mpi_comm_world, ierr)
	else
	   call mpi_reduce(rraboof, 0, 1, mpi_real8, mpi_sum, 0, mpi_comm_world, ierr)
	end if

	if (mytid == 0) then
	   print *, 'raboof mpi reduce', rraboof, numnod
	end if

	call mpi_finalize(ierr)

	end program

Compilation is done with

mpiifort -O3 -i8 impi.F90

and it compiles and links fine:

ldd ./a.out
	linux-vdso.so.1 => (0x00007ffff7893000)
	libdl.so.2 => /lib64/libdl.so.2 (0x0000003357c00000)
	libmpi_ilp64.so.4 => /global/apps/intel/2013.1/impi/4.1.0.024/intel64/lib/libmpi_ilp64.so.4 (0x00002ad1a4a3f000)
	libmpi.so.4 => /global/apps/intel/2013.1/impi/4.1.0.024/intel64/lib/libmpi.so.4 (0x00002ad1a4c69000)
	libmpigf.so.4 => /global/apps/intel/2013.1/impi/4.1.0.024/intel64/lib/libmpigf.so.4 (0x00002ad1a528e000)
	librt.so.1 => /lib64/librt.so.1 (0x0000003358800000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003358000000)
	libm.so.6 => /lib64/libm.so.6 (0x0000003357800000)
	libc.so.6 => /lib64/libc.so.6 (0x0000003357400000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003359c00000)
	/lib64/ld-linux-x86-64.so.2 (0x0000003357000000)

Running the program, however, I obtain

mpirun -np 4 ./a.out
	raboof mpi reduce 3 4
	raboof mpi reduce 3.00000000000000 4

whereas it should produce

mpirun -np 4 ./a.out
	raboof mpi reduce 4 4
	raboof mpi reduce 4.00000000000000 4

which is what I also obtain with other MPI libraries.

I would appreciate any comment/help. 

with best regards,

stefan

P.S.: When I use the Fortran 90 interface ("use mpi"), I obtain the following warnings at compile time:

mpiifort -O3 -i8 impi.F90
	impi.F90(9): warning #6075: The data type of the actual argument does not match the definition. [IERR]
	call mpi_init(ierr)
	-----------------^
	impi.F90(10): warning #6075: The data type of the actual argument does not match the definition. [MYTID]
	call mpi_comm_rank(mpi_comm_world, mytid,ierr)
	--------------------------------------^
	impi.F90(10): warning #6075: The data type of the actual argument does not match the definition. [IERR]
	call mpi_comm_rank(mpi_comm_world, mytid,ierr)
	--------------------------------------------^
	impi.F90(11): warning #6075: The data type of the actual argument does not match the definition. [NUMNOD]
	call mpi_comm_size(mpi_comm_world, numnod,ierr)
	--------------------------------------^
	impi.F90(11): warning #6075: The data type of the actual argument does not match the definition. [IERR]
	call mpi_comm_size(mpi_comm_world, numnod,ierr)
	---------------------------------------------^

and a crash at runtime:

mpirun -np 4 ./a.out
	Fatal error in PMPI_Reduce: Invalid buffer pointer, error stack:
	PMPI_Reduce(1894): MPI_Reduce(sbuf=MPI_IN_PLACE, rbuf=0x693828, count=1, MPI_INTEGER, MPI_SUM, root=0, MPI_COMM_WORLD) failed
	PMPI_Reduce(1823): sendbuf cannot be MPI_IN_PLACE
	Fatal error in PMPI_Reduce: Invalid buffer pointer, error stack:
	PMPI_Reduce(1894): MPI_Reduce(sbuf=MPI_IN_PLACE, rbuf=0x693828, count=1, MPI_INTEGER, MPI_SUM, root=0, MPI_COMM_WORLD) failed
	PMPI_Reduce(1823): sendbuf cannot be MPI_IN_PLACE
	Fatal error in PMPI_Reduce: Invalid buffer pointer, error stack:
	PMPI_Reduce(1894): MPI_Reduce(sbuf=MPI_IN_PLACE, rbuf=0x693828, count=1, MPI_INTEGER, MPI_SUM, root=0, MPI_COMM_WORLD) failed
	PMPI_Reduce(1823): sendbuf cannot be MPI_IN_PLACE


Your ldd result, showing that you linked against the gfortran-compatible library, looks like a problem. This shouldn't happen if you use mpiifort consistently; the gfortran and ifort libraries can't coexist. Adding -# to the mpiifort command should give a lot more detail about what the driver script passes on to ld.

Dear Tim,

Thanks for your immediate reply. Please find below the output from compiling my program (the one above, in the file impi.F90) with your suggested flag:

mpiifort -i8 -# impi.F90 
/global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/bin/intel64/fpp 
 -D__INTEL_COMPILER=1300 
 -D__unix__ 
 -D__unix 
 -D__linux__ 
 -D__linux 
 -D__gnu_linux__ 
 -Dunix 
 -Dlinux 
 -D__ELF__ 
 -D__x86_64 
 -D__x86_64__ 
 -D_MT 
 -D__INTEL_COMPILER_BUILD_DATE=20121010 
 -D__INTEL_OFFLOAD 
 -D__i686 
 -D__i686__ 
 -D__pentiumpro 
 -D__pentiumpro__ 
 -D__pentium4 
 -D__pentium4__ 
 -D__tune_pentium4__ 
 -D__SSE2__ 
 -D__SSE__ 
 -D__MMX__ 
 -I. 
 -I/global/apps/intel/2013.1/impi/4.1.0.024/intel64/include 
 -I/global/apps/intel/2013.1/impi/4.1.0.024/intel64/include 
 -I/global/apps/intel/2013.1/mkl/include 
 -I/global/apps/intel/2013.1/tbb/include 
 -I/global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/compiler/include/intel64 
 -I/global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/compiler/include 
 -I/usr/local/include 
 -I/usr/lib/gcc/x86_64-redhat-linux/4.4.7/include 
 -I/usr/include 
 -4Ycpp 
 -4Ncvf 
 -f_com=yes 
 impi.F90 
 /tmp/ifortBOT7lB.i90
/global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/bin/intel64/fortcom 
 -D__INTEL_COMPILER=1300 
 -D__unix__ 
 -D__unix 
 -D__linux__ 
 -D__linux 
 -D__gnu_linux__ 
 -Dunix 
 -Dlinux 
 -D__ELF__ 
 -D__x86_64 
 -D__x86_64__ 
 -D_MT 
 -D__INTEL_COMPILER_BUILD_DATE=20121010 
 -D__INTEL_OFFLOAD 
 -D__i686 
 -D__i686__ 
 -D__pentiumpro 
 -D__pentiumpro__ 
 -D__pentium4 
 -D__pentium4__ 
 -D__tune_pentium4__ 
 -D__SSE2__ 
 -D__SSE__ 
 -D__MMX__ 
 -mGLOB_pack_sort_init_list 
 -I. 
 -I/global/apps/intel/2013.1/impi/4.1.0.024/intel64/include 
 -I/global/apps/intel/2013.1/impi/4.1.0.024/intel64/include 
 -I/global/apps/intel/2013.1/mkl/include 
 -I/global/apps/intel/2013.1/tbb/include 
 -I/global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/compiler/include/intel64 
 -I/global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/compiler/include 
 -I/usr/local/include 
 -I/usr/lib/gcc/x86_64-redhat-linux/4.4.7/include 
 -I/usr/include 
 "-integer_size 64" 
 -O2 
 -simd 
 -offload_host 
 -mP1OPT_version=13.0-intel64 
 -mGLOB_diag_file=/tmp/ifort7GVk2e.diag 
 -mGLOB_source_language=GLOB_SOURCE_LANGUAGE_F90 
 -mGLOB_tune_for_fort 
 -mGLOB_use_fort_dope_vector 
 -mP2OPT_static_promotion 
 -mP1OPT_print_version=FALSE 
 -mCG_use_gas_got_workaround=F 
 -mP2OPT_align_option_used=TRUE 
 -mGLOB_gcc_version=447 
 "-mGLOB_options_string=-I/global/apps/intel/2013.1/impi/4.1.0.024/intel64/include -I/global/apps/intel/2013.1/impi/4.1.0.024/intel64/include -ldl -i8 -# -L/global/apps/intel/2013.1/impi/4.1.0.024/intel64/lib -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /global/apps/intel/2013.1/impi/4.1.0.024/intel64/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/4.1 -lmpi_ilp64 -lmpi -lmpigf -lmpigi -lrt -lpthread" 
 -mGLOB_cxx_limited_range=FALSE 
 -mCG_extend_parms=FALSE 
 -mGLOB_compiler_bin_directory=/global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/bin/intel64 
 -mGLOB_as_output_backup_file_name=/tmp/ifortK2gIZoas_.s 
 -mIPOPT_activate 
 -mIPOPT_lite 
 -mGLOB_machine_model=GLOB_MACHINE_MODEL_EFI2 
 -mGLOB_product_id_code=0x22006d91 
 -mCG_bnl_movbe=T 
 -mGLOB_extended_instructions=0x8 
 -mP3OPT_use_mspp_call_convention 
 -mP2OPT_subs_out_of_bound=FALSE 
 -mGLOB_ansi_alias 
 -mPGOPTI_value_profile_use=T 
 -mP2OPT_il0_array_sections=TRUE 
 -mP2OPT_offload_unique_var_string=ifort607026576Zo54LN 
 -mP2OPT_hlo_level=2 
 -mP2OPT_hlo 
 -mP2OPT_hpo_rtt_control=0 
 -mIPOPT_args_in_regs=0 
 -mP2OPT_disam_assume_nonstd_intent_in=FALSE 
 -mGLOB_imf_mapping_library=/global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/bin/intel64/libiml_attr.so 
 -mIPOPT_obj_output_file_name=/tmp/ifort7GVk2e.o 
 -mIPOPT_whole_archive_fixup_file_name=/tmp/ifortwarchNyvxkL 
 "-mGLOB_linker_version=2.20.51.0.2-5.36.el6 20100205" 
 -mGLOB_long_size_64 
 -mGLOB_routine_pointer_size_64 
 -mGLOB_driver_tempfile_name=/tmp/iforttempfilenQtt0t 
 -mP3OPT_asm_target=P3OPT_ASM_TARGET_GAS 
 -mGLOB_async_unwind_tables=TRUE 
 -mGLOB_obj_output_file=/tmp/ifort7GVk2e.o 
 -mGLOB_source_dialect=GLOB_SOURCE_DIALECT_FORTRAN 
 -mP1OPT_source_file_name=impi.F90 
 -mP2OPT_symtab_type_copy=true 
 /tmp/ifortBOT7lB.i90
ld 
 /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64/crt1.o 
 /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64/crti.o 
 /usr/lib/gcc/x86_64-redhat-linux/4.4.7/crtbegin.o 
 --eh-frame-hdr 
 --build-id 
 -dynamic-linker 
 /lib64/ld-linux-x86-64.so.2 
 -L/global/apps/intel/2013.1/impi/4.1.0.024/intel64/lib 
 -o 
 a.out 
 /global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/compiler/lib/intel64/for_main.o 
 -L/global/apps/intel/2013.1/impi/4.1.0.024/intel64/lib 
 -L/global/apps/intel/2013.1/mkl/lib/intel64 
 -L/global/apps/intel/2013.1/tbb/lib/intel64 
 -L/global/apps/intel/2013.1/ipp/lib/intel64 
 -L/global/apps/intel/2013.1/composerxe/lib/intel64 
 -L/global/hds/home/install/intel/2013.1/composer_xe_2013.1.117/compiler/lib/intel64 
 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7/ 
 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64 
 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64/ 
 -L/lib/../lib64 
 -L/lib/../lib64/ 
 -L/usr/lib/../lib64 
 -L/usr/lib/../lib64/ 
 -L/global/apps/intel/2013.1/impi/4.1.0.024/intel64/lib/ 
 -L/global/apps/intel/2013.1/mkl/lib/intel64/ 
 -L/global/apps/intel/2013.1/tbb/lib/intel64/ 
 -L/global/apps/intel/2013.1/ipp/lib/intel64/ 
 -L/global/apps/intel/2013.1/composerxe/lib/intel64/ 
 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../ 
 -L/lib64 
 -L/lib/ 
 -L/usr/lib64 
 -L/usr/lib 
 -ldl 
 /tmp/ifort7GVk2e.o 
 --enable-new-dtags 
 -rpath 
 /global/apps/intel/2013.1/impi/4.1.0.024/intel64/lib 
 -rpath 
 /opt/intel/mpi-rt/4.1 
 -lmpi_ilp64 
 -lmpi 
 -lmpigf 
 -lmpigi 
 -lrt 
 -lpthread 
 -Bstatic 
 -lifport 
 -lifcore 
 -limf 
 -lsvml 
 -Bdynamic 
 -lm 
 -Bstatic 
 -lipgo 
 -lirc 
 -Bdynamic 
 -lpthread 
 -Bstatic 
 -lsvml 
 -Bdynamic 
 -lc 
 -lgcc 
 -lgcc_s 
 -Bstatic 
 -lirc_s 
 -Bdynamic 
 -ldl 
 -lc 
 /usr/lib/gcc/x86_64-redhat-linux/4.4.7/crtend.o 
 /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64/crtn.o
rm /tmp/ifortlibgccyi9h59
rm /tmp/ifortgnudirs06mNow
rm /tmp/ifort7GVk2e.o
rm /tmp/ifortBOT7lB.i90
rm /tmp/ifortakfVFX.c
rm /tmp/ifortdashvdk0IZj
rm /tmp/ifortargC1wikG
rm /tmp/ifortgas65oTE2
rm /tmp/ifortK2gIZoas_.s
rm /tmp/ifortldashv7B4mF7
rm /tmp/iforttempfilenQtt0t
rm /tmp/ifortargvFMClQ
rm /tmp/ifortgnudirsMR2abY
rm /tmp/ifortgnudirsHeROwk
rm /tmp/ifortgnudirsDsnJSG
rm /tmp/ifortldashvJ79Ve3
rm /tmp/ifortgnudirsXiurBp
rm /tmp/ifortgnudirsp3WeYL
rm /tmp/ifortgnudirsmUDkl8
rm /tmp/ifort7GVk2e.o

Hi Stefan,

The problem is not related to gfortran.  The libmpigf.so library is used both for gfortran and the Intel® MPI Library.  I am able to get the same behavior here.  I'll check with the developers, but I'm expecting that MPI_IN_PLACE may not be correctly handled in ILP64.

As a note, the MPI Fortran module is not supported for ILP64 programming in the Intel® MPI Library.  Please see Section 3.5.6 of the Intel® MPI Library Reference Manual for more information on ILP64 support.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Hi James,

Thanks for your detailed answer. I look forward to the feedback from the developers. A similar piece of MPI-parallelized code is central to a core functionality of a quantum chemistry program package (called "Dirac") to which I am a contributing developer. It would be great to know that one of the next Intel MPI releases will fully support the ILP64 model.

with best regards,

stefan

Hi Stefan,

Try compiling and running with -ilp64.

mpiifort -ilp64 -O3 test.f90 -o test

mpirun -ilp64 -n 4 ./test

This works for me.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Hi James,

Indeed, MPI_REDUCE with MPI_IN_PLACE also works for me with that setup. However, MPI_COMM_SIZE no longer works:

	program test

	include "mpif.h"

	integer :: mytid, numnod, ierr

	mytid = 0

	! initialize MPI environment
	call mpi_init(ierr)
	call mpi_comm_rank(mpi_comm_world, mytid, ierr)
	call mpi_comm_size(mpi_comm_world, numnod, ierr)

	print *, 'mytid, numnod ', mytid, numnod

	call mpi_finalize(ierr)

	end program

Compiling and running the above test program with

mpiifort -ilp64 -O3 test.F90 
mpirun -ilp64 -np 4 ./a.out 
	mytid, numnod 1 0
	mytid, numnod 0 0
	mytid, numnod 2 0
	mytid, numnod 3 0

yields a "0" for the size of the communicator MPI_COMM_WORLD.

Any idea what could be wrong?

with best regards,

stefan

Hi Stefan,

So I see.  I am able to get the correct results by compiling and linking with -ilp64, but without -i8, and changing the declaration of numnod to integer*8.  Let me check with the developers and see what we can do about this.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Hi James,

Thanks for your feedback; I now get exactly the same behavior you described above. What I should perhaps emphasize is that I was aiming at a working build with 64-bit integers as the default size (-i8 or -integer-size 64), which, as far as I can see, implies the ILP64 model.

What exactly does the -ilp64 flag set during compilation? Obviously, it does not by itself imply 64-bit default integers in the Fortran code; does it only enable linking against the ILP64 Intel libraries?

with best regards,

stefan 

Hi Stefan,

Using -ilp64 links to libmpi_ilp64 instead of libmpi.  The correct way to utilize this is to compile with -i8, then link and run with -ilp64.  However, this is not giving correct results either.
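For clarity, the recipe James describes can be written out as follows. This is only a sketch of the intended usage, assuming the Intel MPI wrappers are on PATH; as he notes, at the time of this thread it still did not produce correct results:

```shell
# Compile with 64-bit default integers (-i8) and request the ILP64
# interface library (-ilp64); then launch with -ilp64 so mpirun matches.
mpiifort -i8 -ilp64 -O3 impi.F90 -o impi
mpirun -ilp64 -np 4 ./impi
```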

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Hi James,

Thanks for the clarification and your patience. Let's see what the developers can come up with.

with best regards,

stefan

Hi Stefan,

There are two workarounds for this.  The first is to avoid MPI_IN_PLACE in programs built with -i8.  The second is to modify mpif.h.  Change

       INTEGER MPI_BOTTOM, MPI_IN_PLACE, MPI_UNWEIGHTED

to

       INTEGER*4 MPI_BOTTOM, MPI_IN_PLACE, MPI_UNWEIGHTED

This works for your test program.  Try it on your full program as well.
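As an illustration of the first workaround, the reduction can be done without MPI_IN_PLACE by having every rank (root included) pass a distinct receive buffer, with the root copying the result back afterwards. This is only a sketch of that pattern applied to the test program above; the program and variable names (reduce_no_in_place, itmp) are made up for the example:

```fortran
	program reduce_no_in_place

	include "mpif.h"

	integer :: mytid, ierr
	integer :: iraboof, itmp

	call mpi_init(ierr)
	call mpi_comm_rank(mpi_comm_world, mytid, ierr)

	iraboof = 1
	itmp = 0

	! all ranks, root included, use distinct send/recv buffers,
	! so no rank ever passes MPI_IN_PLACE
	call mpi_reduce(iraboof, itmp, 1, mpi_integer, mpi_sum, 0, mpi_comm_world, ierr)

	! root copies the reduced value back into the original variable
	if (mytid == 0) iraboof = itmp

	if (mytid == 0) print *, 'sum =', iraboof

	call mpi_finalize(ierr)

	end program
```

The extra buffer costs one temporary per reduction, but it sidesteps the ILP64 handling of the MPI_IN_PLACE sentinel entirely.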

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Stefan,

If you're still watching this, how did the workarounds work for your program?
