On a cluster of Sandy Bridge nodes connected by FDR InfiniBand (IB), we are adding two Intel Xeon Phis
per node. There are two possible PCIe slot assignments for the two Phis relative to the IB HCA:
a) Both Phis go to PCIe slots with lanes attaching to the same processor socket and the IB HCA
stays on its own on the PCIe lanes of the other socket, and
b) One Phi and the IB HCA go to the PCIe lanes of the same socket, and the other Phi stays by
itself on the lanes that go to the other processor socket.
Any experience or thoughts concerning the pros and cons of assignments a) and b)?
In principle, a) should make MIC-to-MIC data transfers within a node more efficient, since both
cards sit behind the same socket, while b) should facilitate transfers of data directly from one
MIC's memory via the IB HCA to and from other nodes, hopefully using RDMA and thus eliminating
the need to copy data to host memory first.
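
For anyone comparing the two layouts on real hardware, it helps to first confirm which socket each slot actually hangs off. Below is a minimal sketch, assuming a Linux host where the kernel exposes the `numa_node` file for each PCI device under `/sys/bus/pci/devices`. The function names `pci_numa_nodes` and `group_by_node` are my own, not part of any Intel or OFED tool, and on some platforms the firmware does not report locality, in which case the kernel shows -1.

```python
from pathlib import Path


def pci_numa_nodes(sysfs_root="/sys/bus/pci/devices"):
    """Map each PCI address to the NUMA node the kernel reports for it.

    A value of -1 means the platform did not expose locality information.
    Devices without a numa_node file are skipped.
    """
    result = {}
    for dev in sorted(Path(sysfs_root).iterdir()):
        node_file = dev / "numa_node"
        if node_file.is_file():
            result[dev.name] = int(node_file.read_text().strip())
    return result


def group_by_node(nodes):
    """Invert the address -> node mapping into node -> list of addresses."""
    groups = {}
    for addr, node in nodes.items():
        groups.setdefault(node, []).append(addr)
    return groups
```

Calling `group_by_node(pci_numa_nodes())` on a node and looking up the addresses of the two Phis and the HCA (as listed by `lspci`) shows whether the box is currently wired as a) or b).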
Any suggestions, particularly from the Intel MPI developer community?