adding further compute nodes

Hi,

Do I need to reinstall Intel Studio after adding further compute nodes to my cluster? There are two InfiniBand islands; ibstat reports:

CA 'mlx4_0', CA type: MT4099 and CA 'mlx4_1', CA type: MT26428. The newest compute nodes are associated with the MT4099.

The following provider errors appear only on the newer nodes:

[2] MPI startup(): dapl fabric is not available and fallback fabric is not enabled
[10] MPI startup(): dapl fabric is not available and fallback fabric is not enabled
[12] MPI startup(): dapl fabric is not available and fallback fabric is not enabled
node009:UCM:2d97:570fa700: 1249 us(1249 us):  open_hca: device mlx4_0 not found
node009:UCM:2d9f:1626f700: 1262 us(1262 us):  open_hca: device mlx4_0 not found
node009:UCM:2da1:7f214700: 1102 us(1102 us):  open_hca: device mlx4_0 not found

 

Regards

Gert

Attachment: ib_provider.txt (text/plain, 5.87 KB)

One way or another, you must ensure that all required shared libraries are available on all nodes. As far as Intel MPI is concerned, repeating the MPI part of the installation with an updated node list would take care of it (after taking care of DAPL and the HCA on the new nodes). Installing Parallel Studio XE (psxe) on a shared drive simplifies this aspect of adding nodes.
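
To see which nodes still need the DAPL/HCA setup, it may help to summarize the startup output per host. The following is only a minimal sketch: it assumes the log lines look like the `open_hca: device ... not found` lines quoted in the question, and the function name `missing_devices_by_host` is made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines of the form quoted in the question, e.g.
# node009:UCM:2d97:570fa700: 1249 us(1249 us):  open_hca: device mlx4_0 not found
# Assumption: the host name is everything before the first colon.
OPEN_HCA_RE = re.compile(
    r"^(?P<host>[^:\s]+):.*open_hca: device (?P<dev>\S+) not found"
)


def missing_devices_by_host(log_text):
    """Map each host to the sorted list of devices it reported as not found."""
    found = defaultdict(set)
    for line in log_text.splitlines():
        match = OPEN_HCA_RE.match(line.strip())
        if match:
            found[match.group("host")].add(match.group("dev"))
    return {host: sorted(devs) for host, devs in found.items()}
```

Feeding it the saved MPI startup output (for instance the contents of the attached ib_provider.txt, if that file contains lines in this format) yields a dictionary keyed by host name, which shows at a glance which nodes cannot open which device.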
