Monitoring Intel® Xeon Phi™ Clusters Using Ganglia

Installing Ganglia

The MPSS User's Guide walks you through a basic Ganglia install. Here are a few points you won't want to miss as you work through those instructions:

1) A number of additional software packages are required to build the Ganglia code on the host system; the MPSS User's Guide provides a list of them. One of these packages, libConfuse (libconfuse0 on SUSE systems), is not available in some major Linux* distributions (a sketch of building it from source follows these numbered points). Possible sources for this code are:

    http://www.nongnu.org/confuse/ for RHEL distributions
    http://download.opensuse.org/distribution/{version_number}/repo/oss/suse/x86_64/ for SUSE distributions

2) The Ganglia files for use on the coprocessor are found in the mpss-[version number]-k1om.tar file, which is available from the same location as the main MPSS tar file. If you follow only the directions in the MPSS User's Guide, the RPMs from that tar file will be reinstalled each time the coprocessor boots, using one of the three methods listed under the heading "Installing Card Side RPMs". In addition to the coreutils*.k1om.rpm and libgmp*.k1om.rpm files, you will need these files: ganglia-[version].k1om.rpm, mpss-ganglia-mpss-[version].k1om.rpm, libconfuse0[version].k1om.rpm and libapr-[version].k1om.rpm
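
On the libConfuse point above: if your distribution does not package it, you can build it from the nongnu source tarball. A minimal sketch, assuming a standard autotools build; the version number 2.7 is a placeholder for whichever version you actually download:

    # Build and install libConfuse from source on the host system.
    tar xzf confuse-2.7.tar.gz
    cd confuse-2.7
    ./configure --prefix=/usr
    make
    make install   # run as root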
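
As for the card-side RPMs listed in point 2, one straightforward way to get them onto the card (a sketch, not necessarily the exact procedure from the User's Guide) is to copy them to a running coprocessor and install them there. This assumes the coprocessor is booted and reachable as mic0, and that the k1om RPMs were extracted into a directory named k1om-rpms on the host (both names are placeholders):

    # Copy the required card-side RPMs to the coprocessor and install them.
    scp k1om-rpms/coreutils-*.k1om.rpm k1om-rpms/libgmp*.k1om.rpm \
        k1om-rpms/ganglia-*.k1om.rpm k1om-rpms/mpss-ganglia-*.k1om.rpm \
        k1om-rpms/libconfuse0*.k1om.rpm k1om-rpms/libapr*.k1om.rpm \
        root@mic0:/tmp
    ssh root@mic0 "rpm -ivh /tmp/*.k1om.rpm"

With the default RAM file system, this installation disappears at the next reboot, which is exactly the problem the next section addresses.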

Now for the rest of the story ----

An Alternate Theory of Installation

When using Ganglia, there are advantages to using an NFS mounted file system as the root file system on the coprocessor. If you are administering a large cluster, you are probably already doing this. For systems using Ganglia, the NFS mounted file system has three advantages.

1) It saves space in coprocessor memory. Ganglia requires a number of files from the mpss-[version number]-k1om.tar file to be installed in the root file system. This increases the size of the root file system, and if that root file system is a RAM file system, that means less memory available for running programs.

2) You do not need to reinstall Ganglia on the coprocessor each time it boots. You only need to do it once when you install a new version of the MPSS.

3) The gmond.conf file stays in /etc on the coprocessor's file system and can be maintained with whatever version control system you use to track other configuration files in /etc. If you make any changes to gmond.conf and reinstall Ganglia each time you boot, then you must be sure to copy over the default gmond.conf with your modified version each time you boot as well.

If you don't want to use NFS, another alternative is to use a static cpio file containing the image of the root file system after Ganglia has been installed. This has the same advantages as points 2 and 3 above for an NFS file system. To build and use this static image, execute the following on the host system:

ssh root@mic0 "cd / ; find . /dev -xdev ! -path "./etc/modprobe.d*" ! -path "./var/volatile/run*" | cpio -o -H newc | gzip -9" > /usr/share/mpss/boot/custom.cpio.gz
micctrl --rootdev=StaticRamFS --target=/usr/share/mpss/boot/custom.cpio

The first command builds a compressed cpio archive of the complete coprocessor RAM file system. It uses ssh to connect to the coprocessor and run a find command over the / and /dev directories, excluding any files that are not part of the RAM file system itself (the -xdev option prunes anything mounted on top of it, such as NFS mounted directories) and excluding the run-state files under ./etc/modprobe.d and ./var/volatile/run. Still on the coprocessor, the list of files from find is piped to cpio to build the archive, and then to gzip to compress the output. Finally, the compressed archive is written back on the host as /usr/share/mpss/boot/custom.cpio.gz. The second command modifies the MPSS configuration to boot from this static image (StaticRamFS) rather than using the default RamFS boot sequence.

Each time you boot the coprocessor, this cpio file will be used rather than the file /var/mpss/micN.image. In general, you will need to replace this custom.cpio file only when you install a new MPSS, make a change to a file in /etc (note that this includes the passwd file), or add, remove, or reconfigure any additional software.
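
Since replacing the image is just a matter of rerunning the build, it can be convenient to keep the commands in a small script. A sketch, with mic0 and the output path carried over from the example above:

    #!/bin/sh
    # rebuild-mic-image.sh: regenerate the static root file system image for mic0.
    # Rerun this after installing a new MPSS, changing a file in /etc on the
    # coprocessor, or adding, removing, or reconfiguring card-side software.
    set -e
    ssh root@mic0 'cd / ; find . /dev -xdev ! -path "./etc/modprobe.d*" ! -path "./var/volatile/run*" | cpio -o -H newc | gzip -9' > /usr/share/mpss/boot/custom.cpio.gz
    echo "New image written; it takes effect the next time the coprocessor boots."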

How Should You Configure Ganglia?

The Intel® Xeon Phi™ Coprocessor System Software Developers Guide provides a brief overview of Ganglia. The guide describes the default configuration given in the MPSS installation instructions as a reference implementation. In actuality, the default configuration provided in the MPSS release requires some changes in order to be useful.

For A Single Node

A good configuration, if you have only one node with Intel® Xeon Phi™ cards installed, would be to have each coprocessor send the output from its gmond daemon directly to the gmond daemon on the host. If you wish to include information from the host as well, you need to transmit the host data from the send side to the receive side of the host's gmond daemon. The gmetad daemon running on the host pulls in the data from the host's gmond and extracts the information into a database. To make use of that data, you need to either install a web server on your host or mount the database and html directories onto a system that does have a web server. This single node configuration, assuming two coprocessors per node, is shown in the figure below.

To implement this configuration:

  • Modify the /etc/ganglia/gmond.conf on the coprocessor
    - in the cluster section, ensure that "name" is set to the unique name you have assigned your system,
    cluster {
      name = "mic_cluster" /*Cluster name, must match with every other node in the cluster
      owner = "unspecified"
      latlong = "unspecified"
      url = "unspecified"
    } 
    - in the udp_send_channel section, ensure that "host" is set to the address by which the coprocessor accesses its host and "port" is set to 8649
    udp_send_channel {
      host = 172.31.1.254
      port = 8649
      ttl = 1
    }
  • Modify the /etc/ganglia/gmond.conf on the host
    - in the udp_recv_channel section, comment out the mcast_join and bind lines; make sure port is set to 8649
    udp_recv_channel {
      /* mcast_join = 239.2.11.71 */
      port = 8649
      /* bind = 239.2.11.71 */
    }
    - in the cluster section, ensure that "name" is set to the same value as on the coprocessors
    cluster {
      name = "mic_cluster"
      owner = "unspecified"
      latlong = "unspecified"
      url = "unspecified"
    }
    - in the udp_send_channel section, if you will be collecting information from the host as well as the coprocessors, change the mcast_join = 239.2.11.71 to host = localhost; make sure port is set to 8649
    udp_send_channel {
      host = localhost
      port = 8649
      ttl = 1
    }
    
    - if you will not be collecting information from the host, comment out the udp_send_channel section.
  • Modify the /etc/ganglia/gmetad.conf on the host
    - ensure that the first uncommented line in the file is
    data_source "name" localhost
    where "name" is the same unique name that you have used as the cluster name in the gmond.conf files

For A Cluster

There are a number of possible configurations for monitoring a cluster. The simplest might be to have the gmond on each coprocessor and on each host node send data to a gmond on one designated node, where it would be aggregated for a gmetad daemon. To do this, the address in the udp_send_channel section of the gmond.conf on each coprocessor would need to be changed to point to this designated node. (Warning: some users have had trouble with this configuration due to aliasing of the coprocessor addresses to their host's IP address.) This configuration is shown in the figure below.
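
In that case, the udp_send_channel section on each coprocessor (and on each host) would look something like the following, where 10.0.0.100 is a placeholder for the designated node's address:

    udp_send_channel {
      host = 10.0.0.100 /* the designated aggregation node; placeholder address */
      port = 8649
      ttl = 1
    }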

Another configuration, which could decrease network traffic, would be to collect the data on each node of the cluster as in the first example. Then, on one designated node of the cluster or on a server outside the cluster, you would install gmetad along with the html files from the ganglia directory in the MPSS. Unless you also want to monitor this designated node, it is not necessary to install gmond there. To make use of the data, you would need to either install a web server on that designated node or mount the database and html directories onto a system that does have a web server. This configuration is shown in the figure below.

To implement this configuration:

  • Configure gmond.conf for the coprocessors and hosts as in the single node configuration
  • Do not start the gmetad on the individual hosts
  • In gmetad.conf on the designated node
    - ensure that the first uncommented lines in the file are
    data_source "name" xxx.xxx.xxx.xxx
    where "name" is the same unique name you have used as the cluster name in the gmond.conf files, with one such line for each host node and xxx.xxx.xxx.xxx replaced by the IP address of that node

Conclusion

At this point you should have a working Ganglia installation. Which metrics you choose to monitor will depend on your reason for monitoring the system, such as system health (temperature, power) or usage (memory, load averages). Remember that monitoring the system can negatively affect performance of running processes, because of the overhead from running the daemons. When you choose metrics, choose wisely. Play around, see what works for you. Happy monitoring!

Other Related Articles

    https://software.intel.com/en-us/articles/configuring-intel-xeon-phi-coprocessors-inside-a-cluster
    https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss

 
