Resolving problem when building HDF5* with Intel® compiler 14.0


Introduction

When the latest HDF5* is built with Intel® compiler 14.0, a segmentation fault occurs when running "make check". This article provides a solution to that issue. It assumes you already understand how to build HDF5* with Intel compilers, as described in Building HDF5* with Intel® compilers.


Version information

HDF5 1.8.11
Intel® C++ Compiler 14.0
Intel® Fortran Compiler 14.0


Problem Statement

When HDF5 1.8.11 is built with Intel® compiler 14.0, running "make check" produces a segmentation fault in the Fortran tests:

                        ==========================
                               FORTRAN tests
                        ==========================
 FORTRANLIB_TEST is linked with HDF5 Library version 1.8 release  11

 Mounting test                                                          PASSED
 Reopen test                                                            PASSED
 File open/close test                                                   PASSED
 File free space test                                                   PASSED
 Dataset test                                                           PASSED
 Extendible dataset test                                                PASSED
 Basic dataspace test                                                   PASSED
 Reference to object test                                               PASSED
 Reference to dataset region test                                       PASSED
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
lt-fortranlib_tes  000000000049E169  Unknown               Unknown  Unknown
lt-fortranlib_tes  000000000049CAE0  Unknown               Unknown  Unknown
lt-fortranlib_tes  0000000000472A62  Unknown               Unknown  Unknown
lt-fortranlib_tes  0000000000458953  Unknown               Unknown  Unknown
lt-fortranlib_tes  000000000042F16B  Unknown               Unknown  Unknown
libpthread.so.0    0000003B8FE0F4C0  Unknown               Unknown  Unknown
libhdf5_fortran.s  00007FAAE945177C  Unknown               Unknown  Unknown
libhdf5_fortran.s  00007FAAE9445CED  Unknown               Unknown  Unknown
lt-fortranlib_tes  000000000042058B  Unknown               Unknown  Unknown
lt-fortranlib_tes  000000000040D76C  Unknown               Unknown  Unknown
lt-fortranlib_tes  000000000040D356  Unknown               Unknown  Unknown
libc.so.6          0000003B8F21EC5D  Unknown               Unknown  Unknown
lt-fortranlib_tes  000000000040D249  Unknown               Unknown  Unknown


Root cause

This is caused by a programming error in the Fortran source code under "hdf5-1.8.11/fortran/src". In file "H5Sff.f90", line 1315 begins an explicit interface for the function "h5sselect_hyperslab_c":

            INTERFACE
              INTEGER FUNCTION h5sselect_hyperslab_c(space_id, operator, &
                               start, count, stride, block)
              USE H5GLOBAL
              !DEC$IF DEFINED(HDF5F90_WINDOWS)
              !DEC$ATTRIBUTES C,reference,decorate,alias:'H5SSELECT_HYPERSLAB_C'::h5sselect_hyperslab_c
              !DEC$ENDIF
              INTEGER(HID_T), INTENT(IN) :: space_id
              INTEGER, INTENT(IN) :: operator
              INTEGER(HSIZE_T), DIMENSION(*), INTENT(IN) :: start
              INTEGER(HSIZE_T), DIMENSION(*), INTENT(IN) :: count
              INTEGER(HSIZE_T), DIMENSION(*), OPTIONAL, INTENT(IN) :: stride
              INTEGER(HSIZE_T), DIMENSION(*), OPTIONAL, INTENT(IN) :: block
              END FUNCTION h5sselect_hyperslab_c
            END INTERFACE

In this interface, the dummy arguments stride and block are declared with the "OPTIONAL" attribute. This is inconsistent with the definition of function "h5sselect_hyperslab_c" in "fortran/src/H5Sf.c", where those arguments carry no "OPTIONAL" attribute.

Section 12.3.2.1 of the Fortran 2003 standard states:

"If an explicit specific interface is specified by an interface body or a procedure declaration statement (12.3.2.3) for an external procedure, the characteristics shall be consistent with those specified in the procedure definition, except that the interface may specify a procedure that is not pure if the procedure is defined to be pure."

 


Solution

Modify lines 1326 and 1327 in file "fortran/src/H5Sff.f90" to remove the "OPTIONAL" attribute:

              INTEGER(HSIZE_T), DIMENSION(*), INTENT(IN) :: stride
              INTEGER(HSIZE_T), DIMENSION(*), INTENT(IN) :: block

Run "make" and "make check" again; the tests will then finish successfully.
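
The same edit can be scripted. What follows is only a minimal sketch, assuming an unmodified HDF5 1.8.11 source tree in which lines 1326 and 1327 are the stride and block declarations shown above. The helper name strip_optional is hypothetical, and it keeps a backup copy of the file with an .orig suffix.

```shell
#!/bin/sh
# Sketch: remove the OPTIONAL attribute from a range of lines in a Fortran file.
# strip_optional is a hypothetical helper, not part of HDF5.

strip_optional() {
    # $1 = file, $2 = first line, $3 = last line
    # A backup is written to <file>.orig before the file is rewritten.
    cp "$1" "$1.orig" &&
    sed "${2},${3}s/, *OPTIONAL//" "$1.orig" > "$1"
}

# Assumption: run from the top of the hdf5-1.8.11 source tree.
SRC="fortran/src/H5Sff.f90"
if [ -f "$SRC" ]; then
    strip_optional "$SRC" 1326 1327
    # Print the two edited lines so they can be checked by eye.
    sed -n '1326,1327p' "$SRC"
fi
```

It is worth checking the printed lines before rebuilding, because the line numbers may differ in other HDF5 releases.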





10 comments

Seth S.

I know this thread is essentially dead but I found a solution to a very similar problem and I thought it might be helpful to someone else.

My build of HDF5-1.8.13 using Intel Compilers 15.0.0 20140723 was failing only the Fortran check tests. The problem I encountered was a memory leak nearly identical to that found by Ögmundur while running the fortranlib_test executable.

The source of the memory leak was a call to h5dread_f_c within the subroutine h5dread_integer_3 in file fortran/src/H5Dff_F03.f90.  The remaining call stack for the error was:

fortran/test/fortranlib_test.f90 - line 130
fortran/test/tH5Sselect.f90 - line 268

I was able to determine that configuring HDF5 with the flag '--enable-fortran2003=no' resulted in a build that passed all of the Fortran check tests.  Configuring it with the flag '--enable-fortran2003=yes' was the cause of the error.

THE SOLUTION for me was to add the compiler flag '-assume nostd_value' to the FCFLAGS environment variable.  This resulted in a successful build and all check tests passed.
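
A minimal sketch of how the flag might be passed, assuming the commands are run from the top of an unpacked HDF5 source tree. The compiler driver names (icc, icpc, ifort) and the short configure option list are assumptions chosen for illustration, not the exact command line used for this build.

```shell
#!/bin/sh
# Sketch: hand '-assume nostd_value' to the Fortran compiler through FCFLAGS.

# Append the flag without discarding anything already set in FCFLAGS.
FCFLAGS="${FCFLAGS:+$FCFLAGS }-assume nostd_value"
export FCFLAGS

if [ -x ./configure ]; then
    # Assumption: Intel compiler drivers are on PATH under these names.
    ./configure CC=icc CXX=icpc FC=ifort \
        --enable-fortran --enable-fortran2003 --enable-cxx &&
    make &&
    make check
else
    echo "configure not found; FCFLAGS would be: $FCFLAGS"
fi
```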

Additional notes:
I was able to build HDF5 (same configure options without the '-assume nostd_value' compiler flag) with the gcc and gfortran compilers (version 4.7.2) with no problems.
I am running RHEL 5.11  with Linux kernel 2.6.18-398.
The flags I used to configure HDF5 were '--enable-fortran --enable-fortran2003 --enable-cxx --with-zlib --enable-shared=yes --enable-static=yes --enable-parallel --enable-unsupported --with-szlib --with-mpe'.

Seth
 

Ögmundur Petersson

Hi, Yolanda,

That won't work because f_ptr is a local variable in h5dwrite_integer_2 and not a dummy argument. In the interface of the h5dwrite_f_c function in the same source file, the dummy argument buf, which corresponds to the actual argument f_ptr, is passed by value using the VALUE keyword, so presumably the pointer is passed as such to the C routine. I tried playing around with the type of the variable buf in the C interface (originally pointer to void) without success.

 

Kind regards,

Ögmundur

Chen, Yuan (Intel)

Hi, Ogmundur

Could you try adding the VALUE attribute to the declaration of f_ptr?

In h5dwrite_integer_2 in H5Dff_F03.f90,

Modify

    TYPE(C_PTR) :: f_ptr
to

   TYPE(C_PTR), VALUE :: f_ptr
 

Recompile and see if you still see the error.

Thank you

Yolanda

Ögmundur Petersson

Hi, Yolanda

Sorry for the late reply, I've been on vacation over Christmas and the New Year. On the Red Hat 5.8 system I have gcc version 4.1.2 (the exact output of gcc --version: gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52)).

I'm afraid that it will be a while before all our systems are upgraded to Red Hat 6, and on the CentOS 6.4 system I can't run the Intel compilers as it has no access to the license server. I have, however, tried to compile HDF5 using version 14.0 of the Intel compilers on my private laptop running Ubuntu 13.4, and ran into the same problems.

Thanks for your help,

Ögmundur

Chen, Yuan (Intel)

Hi, Ogmundur

Sorry for my late response. What's your gcc version on your Red Hat 5.8 system?

According to the release notes, Red Hat 5 is to be deprecated. Have you tried upgrading to Red Hat 6 to see if the problem is still there?

Thank you.

 

Ögmundur Petersson

Hi Yolanda,

Thanks for your tip. I've had a look at the tests in a debugger now. When h5dwrite_f is called in line 175 of tHF5.f90, a 2D array data_in containing the data to write is passed to the routine. The call resolves to h5dwrite_integer_2 in H5Dff_F03.f90, where the local name for the array is simply buf. A C pointer f_ptr to buf is created by calling C_LOC, and this pointer is passed to the actual C routine nh5dwrite_f_c in line 2316 of H5Df.c, which does the work. Within the Fortran routine the data in the buffer is still correct; in the C routine the memory location of the buf variable has changed and the contents are all garbled. If I manually change the contents of the pointer within the debugger to the original memory location of the buf variable in the Fortran routine h5dwrite_integer_2, the correct values are restored and the test completes successfully.

On my development system (Red Hat 5.8) the gfortran compiler is not current enough to compile the Fortran 2003 interface of HDF5. On another system running CentOS 6.4 with gfortran 4.7.2, however, I was able to compile it; with gfortran all tests complete successfully, and in the debugger I can see that the pointer is correctly passed between Fortran and C. I'm not sure it is significant, but in the version of the library compiled with ifort the type of f_ptr in the Fortran routine is c_ptr (consistent with the Fortran source code), whereas in the version compiled using gfortran it is $void (consistent with the C source being called).

I hope this helps but I'm stumped for now.

Best regards and thanks for your help,

Ögmundur

Chen, Yuan (Intel)

Hi, Ogmundur

My Fortran test works fine with 1.8.12, and the test runs quite fast.

============================
Fortran API:  fortranlib_test  Test Log
============================
                        ==========================
                               FORTRAN tests
                        ==========================
 FORTRANLIB_TEST is linked with HDF5 Library version 1.8 release  12

 Mounting test                                                          PASSED
 Reopen test                                                            PASSED
 File free space test                                                   PASSED
 Dataset test                                                           PASSED
 Extendible dataset test                                                PASSED
 Basic dataspace test                                                   PASSED
 Reference to object test                                               PASSED
 Reference to dataset region test                                       PASSED
 Basic selection test                                                   PASSED
 Hyperslab selection test                                               PASSED
 Element selection test                                                 PASSED
 Element selection functions test                                       PASSED
 Selection combinations test                                            PASSED
 Selection bounds test                                                  PASSED
 Basic datatype test                                                    PASSED
 Compound datatype test                                                 PASSED
 Enum datatype test                                                     PASSED
 Derived float datatype test                                            PASSED
 External dataset test                                                  PASSED
 Dataset chunk cache configuration                                      PASSED
 Attribute test                                                         PASSED
 Identifier test                                                        PASSED
 Filters test                                                           PASSED
 SZIP filter test                                                      --SKIP--
 Group test                                                             PASSED
 Error test                                                             PASSED
 VL test                                                                PASSED

                   ============================================
                    FORTRAN tests completed with    0 error(s) !
                   ============================================

It seems your problem comes from the mounting test. I suggest debugging the test to find the root cause:

1. Build a debug version of the tests by setting:

export CFLAGS='-g'
export CXXFLAGS='-g'
export FCFLAGS='-g'

and rebuilding the library.

2. Start the debugger through libtool:

> libtool --mode=execute idbc fortran/test/fortranlib_test_1_8

I think I will need to write another article on how to do this.

Hope it helps.

Ögmundur Petersson

Hi,

Thanks for your quick response. The problem is that the output created by the test routine isn't very informative. After starting the Fortran API tests, a process named lt-fortranlib_t consumes 100% CPU and slowly uses up all of the available memory until I stop it. Looking in the log file fortranlib_test.chklog I see the following:

============================
Fortran API:  fortranlib_test  Test Log
============================
                        ==========================                            
                               FORTRAN tests
                        ==========================                            
 FORTRANLIB_TEST is linked with HDF5 Library version 1.8 release  12
 
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 mounting test error occured
 Mounting test                                                          PASSED
HDF5-DIAG: Error detected in HDF5 (1.8.12) thread 0:
  #000: H5S.c line 405 in H5Sclose(): not a dataspace
    major: Invalid arguments to routine
    minor: Inappropriate type
 h5sclose_f FAILED
 file name obtained from the dataset id is incorrect

The block of text from "HDF5-DIAG" to "file name" is repeated endlessly, and after less than a minute the log file is already over 100 MB. I'm not sure this helps but I'm thankful for any tips.

Best regards,

Ögmundur

Chen, Yuan (Intel)

Hi, Petersson

Thank you for letting me know this. I just downloaded HDF5 1.8.12 and compiled it with 14.0.0, using the same configure line as yours. The make check runs successfully, so I cannot reproduce your issue.

Could you show me the error message and how you detected the memory leak? I can take a look.

Thanks

Yolanda

Ögmundur Petersson

Hi,

I just compiled HDF5 1.8.12 after applying your patch above using icc and ifort version 14.0.0 20130728. When running the tests, the Fortran HDF5 library tests cause a memory leak and never complete successfully. My configure line is the following: ./configure --enable-fortran --enable-fortran2003 --enable-cxx. Using version 12.1 of the compilers, the tests complete successfully.

Kind regards,

Ögmundur Petersson
