Gather-Scatter instructions may not be the optimal choice of instructions when you are trying to achieve superior performance on the Intel® Xeon Phi™ coprocessor. However, if your code uses indirect addressing or performs non-unit strided memory accesses, gather-scatter instructions may be the best option.
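For readers unsure which loops fall into this category, here is a minimal C sketch of the three access patterns mentioned above: an indirect read (gather), an indirect write (scatter), and a non-unit strided read. The function names are hypothetical and chosen only for illustration. Whether a compiler actually emits hardware gather-scatter instructions for these loops depends on the compiler, its flags, and the target, so treat this as a picture of the access pattern rather than a statement about generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Gather (indirect read): dst[i] = src[idx[i]].
 * Function name and signature are hypothetical, for illustration only. */
void gather_f32(float *dst, const float *src, const int *idx,
                size_t n, size_t src_len)
{
    for (size_t i = 0; i < n; ++i) {
        /* Guard against out-of-range indices in debug builds. */
        assert(idx[i] >= 0 && (size_t)idx[i] < src_len);
        dst[i] = src[idx[i]];
    }
}

/* Scatter (indirect write): dst[idx[i]] = src[i].
 * If idx contains duplicates, the last write wins. */
void scatter_f32(float *dst, const float *src, const int *idx,
                 size_t n, size_t dst_len)
{
    for (size_t i = 0; i < n; ++i) {
        assert(idx[i] >= 0 && (size_t)idx[i] < dst_len);
        dst[idx[i]] = src[i];
    }
}

/* Non-unit strided read: dst[i] = src[i * stride].
 * The caller must ensure src holds at least (n - 1) * stride + 1 elements. */
void strided_read_f32(float *dst, const float *src, size_t stride, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i * stride];
}
```

If the index array turns out to be contiguous, or the stride is 1, the same loops reduce to ordinary unit-stride loads and stores, which is the case where gather-scatter is usually not needed.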
I have four servers, each hosting two Phi cards. So far, each server exports a volume over NFS, which is mounted on its Phis. This works quite well, except that I would like a single NFS server to be common to all the Phi cards. With the default routing, mic0 and mic1 can only ping themselves and the host; requests to other servers are not routed. How do I set up the MICs and the servers so that all 8 Phis can mount a volume from a common server?
[olews@compute-19-20-mic0 olews]$ route
Kernel IP routing table
I am trying to execute the Intel MPI benchmark in the following configuration: CentOS 6.5 with Intel MPI version 4.1.3.045 and MPSS 3.1.2, and OFED 22.214.171.124 installed from source. My network configuration is default (static pair produced by "micctrl --initdefaults"), and I have 1 node with two 3120A Xeon Phi coprocessors.
The MPI benchmark works just fine with fabrics "tcp" or "shm:tcp". Namely, I am able to run the benchmark between localhost and mic0, and between mic0 and mic1. However, with fabric "dapl", I cannot run IMB between localhost and mic0:
I tried to use SCIF RMA to exchange a large amount of data between two MICs, but I find it almost impossible to align the memory addresses on both cards to a 4K page boundary.
Exchanging memory through the host gives me a 1.7x speedup over a single card. How much more performance would SCIF RMA give if I can get it to work? I want to decide whether I should continue working in this direction.
Thanks a lot.
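On the alignment part of the question above, one common way to get a 4K-aligned buffer on each side is to allocate it with `posix_memalign` and round the length up to a whole number of pages. The sketch below does only that; it does not call any SCIF function, and the helper names and the `PAGE_SIZE_4K` constant are hypothetical. It assumes the page size is 4096 bytes, as stated in the post, and that aligning both the address and the length is what the registration call needs, which should be checked against the SCIF documentation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed page size, taken from the 4K figure in the post. */
#define PAGE_SIZE_4K ((size_t)4096)

/* Round a byte count up to the next multiple of 4K (hypothetical helper). */
size_t round_up_to_page(size_t nbytes)
{
    return (nbytes + PAGE_SIZE_4K - 1) & ~(PAGE_SIZE_4K - 1);
}

/* Return nonzero if the pointer sits on a 4K boundary. */
int is_page_aligned(const void *p)
{
    return ((uintptr_t)p & (PAGE_SIZE_4K - 1)) == 0;
}

/* Allocate a buffer whose start address and length are both 4K multiples.
 * Returns NULL on failure; release the buffer with free(). */
void *alloc_page_aligned(size_t nbytes, size_t *rounded_len)
{
    void *p = NULL;
    size_t len = round_up_to_page(nbytes);

    if (len == 0)
        len = PAGE_SIZE_4K;          /* always hand back at least one page */
    if (posix_memalign(&p, PAGE_SIZE_4K, len) != 0)
        return NULL;
    if (rounded_len != NULL)
        *rounded_len = len;
    return p;
}
```

The same allocation would be done independently on each card; the two buffers do not need the same virtual address, only the same alignment.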
I am trying to create a static library that uses the MIC offload feature, but I run into an xlib error, shown below:
With slight modifications to the Makefiles, I was able to compile most of the MPSS 3.1.1 and 3.1.2 components from source on Ubuntu 12.04 and get them working. However, micflash -update is giving me problems, as is micflash -getversion. micflash doesn't seem to be able to read the flash from the MIC card. Below are my steps:
Shows the ready state for both the devices
micflash -vv -update -device all
Hi All, my Phi card does not seem to be running at its fastest clock. Can anyone shed some light on what might be happening? It looks like the max scaling frequency is set to 1238094, whereas it should be 1333332. Any suggestions on how this can be set correctly?
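To make the gap concrete: 1238094 is about 92.9% of 1333332. A small C sketch for reading such values and computing the ratio follows. It assumes the numbers come from the standard Linux cpufreq sysfs files (for example `scaling_max_freq` and `cpuinfo_max_freq` under `/sys/devices/system/cpu/cpu0/cpufreq/`), which report kHz; whether the coprocessor's Linux exposes those files is an assumption, and the function names are hypothetical.

```c
#include <stdio.h>

/* Read a single integer (kHz) from a sysfs-style text file.
 * Returns -1 if the file cannot be opened or parsed.
 * Hypothetical helper; the path is supplied by the caller. */
long read_khz(const char *path)
{
    FILE *f = fopen(path, "r");
    long khz = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &khz) != 1)
        khz = -1;
    fclose(f);
    return khz;
}

/* Express the current cap as a percentage of the hardware maximum.
 * Returns -1.0 if either input is not a positive value. */
double percent_of_max(long cap_khz, long hw_max_khz)
{
    if (cap_khz <= 0 || hw_max_khz <= 0)
        return -1.0;
    return 100.0 * (double)cap_khz / (double)hw_max_khz;
}
```

Calling `percent_of_max(read_khz(cap_path), read_khz(max_path))` with the two sysfs paths would show whether the cap is below the hardware maximum; a result of -1.0 means one of the files could not be read.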
Is there any way to turn off CPU Frequency Scaling on Xeon Phi?
On the remote process, dlopen() failed. The error message sent back from the sink is /var/volatile/tmp/coi_procs/1/5414/load_lib/ifortoutjAzgEs: undefined symbol: cdata_
offload error: cannot load library to the device 0 (error code 20)
On the sink, dlopen() returned NULL. The result of dlerror() is "/var/volatile/tmp/coi_procs/1/5414/load_lib/ifortoutjAzgEs: undefined symbol: cdata_"
I don't know why dlopen should be involved. The source code fragment:
When creating users for the MIC, I have discovered that users with a UID higher than 65535 are skipped.
This prevents many of my users from being included in the access list and in the passwd file created for the MIC cards.
Is there a bugfix for this or a workaround?