Author: Michael Hebenstreit
Contributions: Romain Dolbeau, Jeremy C. Siadal
Version: 0.81, 20130110
This paper gives readers a blueprint for setting up and configuring a cluster whose systems contain the Intel® Xeon Phi™ Coprocessor, based on how Intel configured its own Endeavor cluster. Along the way, it describes in detail how to compile tools, configure filesystems, and set up network interfaces, to show how this can be done en masse.
To satisfy current standard cluster usage models, where users expect to be able to reach every system that is part of an MPI job via a simple password-less ssh command, and find all the filesystems they expect mounted on every node, some key administrative setup must be performed.
The solution proposed in this document covers the following features:
- Users access Intel Xeon Phi coprocessors with standard privileges using direct, password-less ssh
- The home NFS server is mounted, as well as Lustre* and Panasas* shares
- Use of bridged networking to avoid routing problems
- Automated detection of installed Intel Xeon Phi coprocessors via lspci
- USER accounts added to all Intel Xeon Phi coprocessor cards on the system, but no password is set
- Removal of inetd on the Intel Xeon Phi coprocessors to maximize security
- Correct MTU and NETMASK settings on the Intel Xeon Phi coprocessors
- Startup of coi_daemon as USER
- Enhancement of dropbear ssh environment with ulimits
- Automated startup of OFED Intel Xeon Phi Coprocessor Communication Link (CCL)
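
As an illustration of the lspci-based detection mentioned in the list above, the following is a minimal sketch in Python and not the script used on Endeavor. It assumes that lspci reports each card on a line containing "Co-processor:" and "Xeon Phi"; the exact device string can differ between lspci versions and card models, so the pattern may need adjusting. The names `PHI_PATTERN`, `find_phi_cards`, and `detect_phi_cards` are hypothetical and chosen only for this example.

```python
import re
import subprocess

# Assumption: lspci prints lines such as
#   "02:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 5100 series (rev 11)"
# Adjust the pattern if the local lspci output looks different.
PHI_PATTERN = re.compile(r"^(\S+)\s+Co-processor:.*Xeon Phi", re.IGNORECASE)


def find_phi_cards(lspci_output):
    """Return the PCI addresses of all lines that look like Xeon Phi cards."""
    cards = []
    for line in lspci_output.splitlines():
        match = PHI_PATTERN.match(line.strip())
        if match:
            # group(1) is the first whitespace-delimited field, the PCI address
            cards.append(match.group(1))
    return cards


def detect_phi_cards():
    """Run lspci on the local host and return the detected PCI addresses."""
    result = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=True
    )
    return find_phi_cards(result.stdout)
```

On a host with lspci installed, `len(detect_phi_cards())` gives the number of cards found, which a startup script could use to decide how many coprocessors to configure. Keeping the parsing in `find_phi_cards` separate from the lspci call makes it possible to check the pattern against saved output from a different node.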
Download the complete article (PDF) and code sample files below: