MPI startup(): ofa fabric is not available

I've been using Intel MPI version 4.0.2.003 but I recently installed 4.1.0.024.  I have a script to run IMB and I used it to try out 4.1.0.  The script works fine with 4.0.2, but when I use 4.1.0, I get:

> [0] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
> [1] MPI startup(): ofa fabric is not available and fallback fabric is not enabled

I get the same result with DAPL: it works fine with 4.0.2, but the fabric is not available with 4.1.0.

What's the difference in fabric detection between the two versions?

Thanks.

Hi John,

Let me check with the developers for details about that.  What do you get from "env | grep I_MPI"?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Hi John,

Also, can you please send the output from a run with I_MPI_DEBUG=2?  This will show more details about the fabric selection process.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Hi John,

The developers will also need to know your OS/distribution, the OFED version you are using, and how you are launching the application.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Thanks for your responses.  My I_MPI environment variables are:

> I_MPI_INC=/cray/css/iaa/mpi_images/impi/4.1.0.024/intel64/intel64/include
> I_MPI_F77=/opt/intel/composerxe-2011.5.220/bin/intel64/ifort
> I_MPI_FABRICS=shm:ofa
> I_MPI_LIB=/cray/css/iaa/mpi_images/impi/4.1.0.024/intel64/intel64/lib
> I_MPI_CC=/opt/intel/composerxe-2011.5.220/bin/intel64/icc
> I_MPI_DEBUG=2
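
For readers unfamiliar with the setting: in `I_MPI_FABRICS=shm:ofa`, the name before the colon is the fabric used within a node and the name after it is the fabric used between nodes; a single name applies to both. A minimal sketch of that split follows. The helper name `parse_fabrics` is hypothetical and is not part of Intel MPI.

```python
def parse_fabrics(value):
    """Split an I_MPI_FABRICS value into (intra_node, inter_node).

    Hypothetical helper for illustration only; Intel MPI does its own parsing.
    """
    intra, sep, inter = value.strip().partition(":")
    # Reject empty names such as ":ofa" or "shm:".
    if not intra or (sep and not inter):
        raise ValueError(f"malformed I_MPI_FABRICS value: {value!r}")
    # A single name (no colon) is used for both intra- and inter-node traffic.
    return (intra, inter if sep else intra)


print(parse_fabrics("shm:ofa"))
```

With the value from this post, shared memory is requested inside a node and OFA between nodes, so the "ofa fabric is not available" message concerns the inter-node part.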

Output from the job is:

> [1] MPI startup(): RLIMIT_MEMLOCK too small
> [1] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
> [0] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
> [0] MPI startup(): RLIMIT_MEMLOCK too small

I'm running this on a Cray XC30 running Linux SLES 11 SP2.  The release is 3.0.42.  I'm launching the job with mpirun.  The runtime environment is provided by CCM, a Cray product that provides a cluster-like execution environment on Cray compute nodes.  The OFA layer is provided by IAA, another Cray product: a library that provides an IB verbs interface but transfers data directly across the Cray high-speed network.

As I said in my original post, this works with 4.0.2, but fails with 4.1.0.  I suspect there's some environment variable I don't know about, or perhaps I'm setting something incorrectly.

Thanks.

Hi John,

What is the output from ulimit -a?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

'ulimit -a' gives:

> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 257676
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 32768
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) unlimited
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 257676
> virtual memory          (kbytes, -v) 26395360
> file locks                      (-x) unlimited

I saw the "RLIMIT_MEMLOCK too small" and tried 'ulimit -l unlimited', but it didn't make a difference.

Output from 'ulimit -a' is:

> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 257676
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 32768
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) unlimited
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 257676
> virtual memory          (kbytes, -v) 26395360
> file locks                      (-x) unlimited

I noticed the "RLIMIT_MEMLOCK too small" message and tried 'ulimit -c unlimited -l unlimited', but it didn't make any difference (the above output is from after the ulimit command).

Hi John,

Using 'ulimit -l unlimited' should have set the locked memory limit to unlimited (that is the limit I'm concerned about).  Is there anything in your /etc/security/limits.conf file that is putting a hard limit on locked memory?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

It doesn't look like there's anything in /etc/security/limits.conf.  The whole file is comments and empty lines.

I get an error when I try to use 'ulimit -l':

> ulimit: max locked memory: cannot modify limit: Operation not permitted

Hi John,

Check in /etc/security/limit.d/ for any additional files.  You should be able to modify the locked memory limit.  Are you trying this as root or as a standard user?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

/etc/security/limit.d/ doesn't exist in my environment.

I'm trying to run as a standard user.  It appears that I can reduce the max locked memory, but not increase it:

> jdc@nid00012 ~ $ ulimit -l 16
> jdc@nid00012 ~ $ ulimit -l 32
> -bash: ulimit: max locked memory: cannot modify limit: Operation not permitted
> jdc@nid00012 ~ $ ulimit -a
> ...
> max locked memory       (kbytes, -l) 16

How big do I need max locked memory to be?

Hi John,

We recommend setting it to unlimited, but the value you need depends on your application.  You very likely have a hard limit set somewhere on your system; a standard user cannot raise the limit above the hard limit.  If you can't get the memory limit higher, let's try a different approach.  There is a basic test program included with the Intel® MPI Library.  It is in the test/ subfolder of the installation.  Compile any one of the source files present, and try running that program.  Its memory usage is extremely low (it is simply a Hello World program), so it should run with a lower memory limit.
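
The soft/hard distinction can be illustrated with a minimal Python sketch using the standard `resource` module. The function name `can_raise_unprivileged` is hypothetical, and the check only models the usual Linux rule that an unprivileged process may move its soft limit anywhere up to the hard limit. It also suggests why `ulimit -l 32` was refused right after `ulimit -l 16` in the session above: in bash, `ulimit` without `-S` or `-H` sets both limits, so the first command probably lowered the hard limit to 16 as well.

```python
import resource


def can_raise_unprivileged(target, hard):
    """Return True if a soft limit of `target` bytes is allowed under `hard`.

    Hypothetical helper: models the rule that an unprivileged process
    cannot set its soft limit above the current hard limit.
    """
    if hard == resource.RLIM_INFINITY:
        return True   # no ceiling, any target is allowed
    if target == resource.RLIM_INFINITY:
        return False  # "unlimited" exceeds any finite hard limit
    return target <= hard


# Report the limits of the current process (values are in bytes).
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print("soft:", soft, "hard:", hard)
print("can request unlimited:", can_raise_unprivileged(resource.RLIM_INFINITY, hard))
```

If the last line prints `False`, the hard limit has to be raised by an administrator or by the job launcher before the MPI processes start.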

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

The "Hello, world" test program runs over TCP, but fails over DAPL and OFA, just as IMB did.

I have a query out about changing the max locked memory limit.  Forgive me for asking this, but do we know that the memory limit is the reason that fabric detection is failing?

Thanks.

Hi John,

I'm not certain that the memory limit is the cause of the problem.  But the memory limit is definitely a problem; otherwise the "RLIMIT_MEMLOCK too small" messages would not be shown.  I'm checking with the developers for more information.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Hi John,

I've checked with the developers.  The locked memory limit will need to be at least 32 MB in order to run.  This is a requirement set in our code.  Please set it to at least 32 MB and try again.
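
To compare this requirement with the `ulimit -l` output earlier in the thread, note that `ulimit -l` prints kbytes. Assuming "32 MB" here means 32 × 1024 kbytes (an assumption; the post does not say whether MB is binary or decimal), the threshold is 32768, far above the reported value of 64. A minimal sketch of the comparison, with the hypothetical helper name `memlock_ok`:

```python
# Assumption: 32 MB is read as 32 * 1024 kbytes, the unit `ulimit -l` prints.
REQUIRED_KB = 32 * 1024


def memlock_ok(ulimit_l_value, required_kb=REQUIRED_KB):
    """Check a value as printed by `ulimit -l` ('unlimited' or a kbyte count)."""
    text = str(ulimit_l_value).strip()
    if text == "unlimited":
        return True
    return int(text) >= required_kb


print(memlock_ok("64"))         # the limit reported in this thread
print(memlock_ok("unlimited"))  # the recommended setting
```

The check has to pass in the environment where the MPI processes are actually launched, which may differ from an interactive login shell.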

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

I found a (Cray-specific) way to raise the locked memory limit and both the "Hello, world" test and IMB run under 4.1.0 now.

Thanks for your help.

Hi John,

Great!  I'm glad everything is working now.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Can you elaborate on the "cray-specific" way to raise RLIMIT_MEMLOCK?
