Intel MKL libraries with remote StatET connection crashes R?

Hi everyone,

Is anyone using the Intel MKL math libraries with R and accessing R remotely through Eclipse with the StatET plugin?

I am trying to connect to R 2.15.3 running on Linux from a Windows box running Eclipse 3.8.2 and the StatET plugin for R. I have been using this setup for months without issue. Recently, though, we recompiled R with the Intel MKL optimized numeric libraries, and we are now running into problems, but only through StatET.

If I run a simple matrix operation like "crossprod(1:4)" in a local R session on the server, it returns fine. On a more complex operation, it is noticeably faster (one test went from 86 seconds down to 35 seconds), which is what we were hoping to see by using these libraries. But if I start a remote session via Eclipse/StatET, and run the same command, it kills the entire remote R session.

In the log files on the server I see the error "Intel MKL FATAL ERROR: Cannot load libmkl_mc3.so or libmkl_def.so". I've compared LD_LIBRARY_PATH in both the local and remote sessions and can't find any differences: both have entries pointing to the Intel MKL directories.

> Sys.getenv("LD_LIBRARY_PATH")
[1] "/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/../lib/amd64:/opt/intel/composer_xe_2013_sp1/mkl/lib/intel64:/opt/intel/composer_xe_2013_sp1.0.080/mkl/lib/intel64/:/opt/intel/composer_xe_2013_sp1.0.080/compiler/lib/intel64:/opt/rproject/R-2.15.3/lib64/R/lib:/opt/openmpi/openmpi-1.4.3/lib:/opt/intel/composer_xe_2013_sp1.0.080/mkl/lib/intel64:/opt/intel/composer_xe_2013_sp1.0.080/compiler/lib/intel64:/opt/openmpi/openmpi-1.4.3/lib"
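Since both sessions report the same LD_LIBRARY_PATH, one further check is whether the two libraries named in the error actually exist in a directory on that path. The sketch below is one way to do that from a shell on the server. Note that find_lib_on_path is a hypothetical helper name (not part of MKL, R, or StatET), and it only tests that the file exists, not that the loader can resolve its symbols.

```shell
# find_lib_on_path PATHLIST LIBNAME   (hypothetical helper, for illustration)
# Prints the first directory in the colon-separated PATHLIST that contains
# LIBNAME, or "NOT FOUND" if none of the directories does.
find_lib_on_path() {
    rest=$1
    libname=$2
    while [ -n "$rest" ]; do
        # Take the next entry up to the first colon.
        dir=${rest%%:*}
        if [ "$dir" = "$rest" ]; then
            rest=
        else
            rest=${rest#*:}
        fi
        # Skip empty entries; report the first directory holding the file.
        if [ -n "$dir" ] && [ -e "$dir/$libname" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    printf 'NOT FOUND\n'
    return 1
}
```

For example, find_lib_on_path "$LD_LIBRARY_PATH" libmkl_def.so prints the first matching directory, and the same call with libmkl_mc3.so covers the other library from the error message.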

I've fiddled with numerous settings and environment variables, and tried reinstalling the StatET components on the server, all without success. I don't know if this is a StatET problem, an Intel MKL problem, or a local configuration problem.

Any suggestions would be appreciated.

Some additional detail that shows up in the error log as soon as I issue the "crossprod(1:4)" statement that causes the crash:

eclipse.buildId=unknown
java.version=1.7.0_40
java.vendor=Oracle Corporation
BootLoader constants: OS=win32, ARCH=x86_64, WS=win32, NL=en_US
Command-line arguments: -os win32 -ws win32 -arch x86_64
Error Mon Oct 07 11:24:51 EDT 2013
Communication error detail. Send:
MainCmdC2SList ():
<ITEM i="0">
ConsoleReadCmdItem
options= 0xc1000001
<TEXT>
crossprod(1:4)
</TEXT>
</ITEM>
java.rmi.UnmarshalException: Error unmarshaling return header; nested exception is:
java.io.EOFException
    at sun.rmi.transport.StreamRemoteCall.executeCall(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(Unknown Source)
    at java.rmi.server.RemoteObjectInvocationHandler.invoke(Unknown Source)
    at com.sun.proxy.$Proxy15.runMainLoop(Unknown Source)
    at de.walware.rj.server.client.AbstractRJComClient.runMainLoop(AbstractRJComClient.java:874)
    at de.walware.rj.server.client.AbstractRJComClient.answerConsole(AbstractRJComClient.java:1152)
    at de.walware.statet.r.nico.impl.RjsController.doSubmitL(RjsController.java:1040)
    at de.walware.statet.nico.core.runtime.ToolController.submitToConsole(ToolController.java:1772)
    at de.walware.statet.r.nico.AbstractRDbgController.submitToConsole(AbstractRDbgController.java:703)
    at de.walware.statet.nico.core.runtime.ToolController$ConsoleCommandRunnable.run(ToolController.java:189)
    at de.walware.statet.r.nico.AbstractRController$RCommandRunnable.run(AbstractRController.java:68)
    at de.walware.statet.nico.core.runtime.ToolController.loopRunTask(ToolController.java:1260)
    at de.walware.statet.nico.core.runtime.ToolController.loop(ToolController.java:1066)
    at de.walware.statet.nico.core.runtime.ToolController.run(ToolController.java:577)
    at de.walware.statet.nico.core.runtime.ToolRunner.run(ToolRunner.java:85)
    at de.walware.statet.nico.core.runtime.ToolRunner.access$0(ToolRunner.java:83)
    at de.walware.statet.nico.core.runtime.ToolRunner$1.run(ToolRunner.java:97)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readByte(Unknown Source)
    ... 18 more

Keith,

Can you confirm that both MKL and R are installed on the remote machine? Also, can you make sure the MKL environment variables are set on the remote machine?

-Sridevi

Sridevi Allam
Technical consulting engineer - Intel MKL

Hi Sridevi,

Thanks for the response. R and MKL are installed on the remote machine. It may not have been clear in my original message, but I can log in to the remote machine, run a local R session on that server, and successfully run commands that use MKL (I see multiple cores engaged and a performance speedup). But when I run the same command through a remote R session, R crashes.

I have compared the LD_LIBRARY_PATH variables between both sessions (local vs remote), and both include the same MKL directories. I tried explicitly sourcing "/opt/intel/mkl/bin/mklvars.sh intel64" from within the remote R session to force any variables that may not have been set, and R still crashes when I run my test command. Is there something else I can do to check that variables are set properly?
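One caveat about that comparison: if mklvars.sh is sourced through a child shell (for example via system() in R), the variables it sets do not reach the R process that is already running. On Linux, the environment a process was started with can be read from /proc/<pid>/environ, so the local and remote R processes can be compared at that level too. A minimal sketch, assuming a Linux /proc filesystem; env_from_environ is a hypothetical helper name used only for illustration:

```shell
# env_from_environ FILE NAME   (hypothetical helper, for illustration)
# FILE is a NUL-separated environment dump such as /proc/<pid>/environ.
# Prints the value of the variable NAME, or nothing if it is not set.
env_from_environ() {
    # Turn the NUL separators into newlines, then strip the "NAME=" prefix
    # from the matching line and print only that line.
    tr '\0' '\n' < "$1" | sed -n "s/^$2=//p"
}
```

With the process ID reported by Sys.getpid() in each R session (12345 below is a placeholder), env_from_environ /proc/12345/environ LD_LIBRARY_PATH shows the value that process was launched with. Reading another user's /proc entry may require matching permissions.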

The only error I see in the log file for the remote R session after it crashes is: "Intel MKL FATAL ERROR: Cannot load libmkl_mc3.so or libmkl_def.so." 

Hey OP, I also use the Intel MKL math libraries with R and I am facing the same problem. Can someone reply to this post?

We were eventually able to get it to work by using statically linked libraries rather than dynamic.
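The thread does not record the exact link line that was used, so the following is only a sketch of what a static MKL link line for R's configure script can look like. It assumes a GNU toolchain (gcc/gfortran), the LP64 interface, and GNU OpenMP threading; a build with the Intel compilers would typically use libmkl_intel_lp64.a, libmkl_intel_thread.a, and the Intel OpenMP runtime instead. The function name mkl_static_blas_flags is hypothetical.

```shell
# mkl_static_blas_flags LIBDIR   (hypothetical helper, for illustration)
# Prints linker flags that pull in the MKL static archives from LIBDIR.
# Assumes gfortran interface (gf_lp64) and GNU OpenMP threading (gnu_thread).
mkl_static_blas_flags() {
    libdir=$1
    # The archives depend on each other, so they are wrapped in a linker group.
    printf '%s' "-Wl,--start-group"
    for lib in libmkl_gf_lp64.a libmkl_gnu_thread.a libmkl_core.a; do
        printf ' %s/%s' "$libdir" "$lib"
    done
    printf ' %s\n' "-Wl,--end-group -lgomp -lpthread -lm -ldl"
}
```

The output could then be passed to R's build, for example ./configure --with-blas="$(mkl_static_blas_flags /opt/intel/mkl/lib/intel64)" --with-lapack, where the directory is an assumption based on the mklvars.sh path mentioned earlier in the thread.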

 
