HPCC benchmark failing with Intel MPI

Hello,

I've compiled the HPCC benchmark suite (http://icl.cs.utk.edu/hpcc/) with Intel MPI, but I am running into the following run-time problem:

[bart@head 2x8]$ /share/intel/impi/3.2.1.009/bin64/mpirun -f 1.nodelist -n 16 -r ssh ./hpcc
node002:27686: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27684: reg_mr Cannot allocate memory
node002:27679: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27686: reg_mr Cannot allocate memory
node001:2834: reg_mr Cannot allocate memory
node002:27679: reg_mr Cannot allocate memory
node001:2834: reg_mr Cannot allocate memory
node002:27686: reg_mr Cannot allocate memory
node001:2839: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27679: reg_mr Cannot allocate memory
node002:27681: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27681: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node002:27685: reg_mr Cannot allocate memory
node002:27685: reg_mr Cannot allocate memory
node002:27680: reg_mr Cannot allocate memory
node001:2833: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node002:27680: reg_mr Cannot allocate memory
node001:2833: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27682: reg_mr Cannot allocate memory
node001:2836: reg_mr Cannot allocate memory
node002:27684: reg_mr Cannot allocate memory
node001:2839: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node001:2838: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
node002:27684: reg_mr Cannot allocate memory
node001:2835: reg_mr Cannot allocate memory
register failed 196608 [10] error(0x30000): OpenIB-cma: DAT_INSUFFICIENT_RESOURCES:

node001:2835: reg_mr Cannot allocate memory
[4:node002][rdma_iba.c:220] Intel MPI fatal error: DTO operation posted for [10:node001] completed with error. status=0x1. cookie=0x4000a
rank 10 in job 1 head_46465 caused collective abort of all ranks
exit status of rank 10: return code 1

The benchmark fails at the start of the HPL part of the benchmark. Any suggestions for fixes would be most appreciated.

Thanks,
Bart

Correction: it fails during the PTRANS part of the benchmark.

Bart

Hi Bart,

It's very strange that you cannot run HPCC, since it is a fairly standard test. Have you read this article: http://software.intel.com/en-us/articles/performance-tools-for-software-... I hope it will be useful.

Your system might not have enough memory for 16 processes. Have you tried fewer?
Could you provide some details about your cluster?

Best wishes,
Dmitry
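
On the memory question above: the `reg_mr Cannot allocate memory` lines come from memory registration, and one thing that may be worth checking (an assumption, nothing in this thread confirms it is the cause) is the locked-memory limit of the MPI processes on each compute node. A minimal sketch in Python that prints the limit for the current process; the helper names `format_limit` and `memlock_limits` are made up for this example:

```python
import resource


def format_limit(value):
    """Render an rlimit value (given in bytes) in a readable form."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def memlock_limits():
    """Return the (soft, hard) locked-memory limits of this process."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print(f"max locked memory: soft={format_limit(soft)}, hard={format_limit(hard)}")
```

To be meaningful it has to run on the compute nodes, started the same way as the MPI ranks (for example over ssh), because the limit in an interactive shell on the head node can differ.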

Bart,
Please try the mpirun -nolocal option, because Intel MPI starts processes on the local host by default.

Sergey

FYI
The following link has a web form for building your parameter file (the HPL.dat file). It tries to build a parameter file that maximizes node usage to get the best possible FLOP score. It is also handy because it builds a parameter file for the number of nodes/cores you want, without your having to understand the file format.

http://lab.advancedclustering.com/hpl.html
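
For anyone who wants a rough idea of the sizing such a form does, here is a minimal sketch in Python of the usual rule of thumb: choose N so that the N x N matrix of 8-byte doubles fills a fraction of total memory, round N down to a multiple of the block size NB, and pick a P x Q process grid that is as square as possible. This is not the code behind the form above; the 80% fraction, NB = 128, and the 16 GiB per node in the usage lines are assumptions for illustration.

```python
import math


def hpl_problem_size(nodes, mem_per_node_gib, fraction=0.8, nb=128):
    """Estimate HPL N: largest multiple of nb whose matrix fits in the memory budget."""
    total_bytes = nodes * mem_per_node_gib * 1024**3
    # An N x N matrix of doubles needs 8 * N * N bytes.
    n = int(math.sqrt(fraction * total_bytes / 8))
    return (n // nb) * nb


def process_grid(nprocs):
    """Return (P, Q) with P * Q == nprocs, P <= Q, and P as close to Q as possible."""
    p = math.isqrt(nprocs)
    while nprocs % p != 0:
        p -= 1
    return p, nprocs // p


if __name__ == "__main__":
    # Hypothetical cluster: 2 nodes with 16 GiB each, 16 MPI processes.
    print("N =", hpl_problem_size(nodes=2, mem_per_node_gib=16))
    print("P x Q =", process_grid(16))
```

Lowering `fraction` gives a smaller N, which is a quick way to test whether a run fails because of memory pressure.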
