Data Collection

Before Intel® Cluster Checker can identify issues, it must first gather data from the cluster. Intel® Cluster Checker uses providers to collect data from the system and stores that data in a database. Framework definitions determine what data to collect by defining a set of data providers to run.

 

Running Data Collection

The clck program triggers data collection followed immediately by analysis. The clck-collect program only triggers data collection. 

Typical invocations of these commands are:

clck -f nodefile

clck-collect -f nodefile

The most basic invocation of these commands uses only the -f command line option, which specifies a file containing a list of nodes. By default, Intel® Cluster Checker collects and analyzes data to evaluate the health of the cluster using the health framework definition.

Some data providers create temporary files, and these temporary files must be visible to all nodes in the cluster.

When data collection is executed by a non-privileged user, a temporary directory is created inside the .clck directory in that user's home directory to store the temporary files. This location can be changed by setting the CLCK_SHARED_TEMP_DIR environment variable.

Running data collection as root should be avoided whenever possible, since doing so can cause problems for some data providers. There may be cases in which executing as root is necessary, such as when a provider needs a tool that is not available to a non-privileged user on the system, but limiting execution as root as much as possible is recommended.

When the root user runs data collection, it is necessary to set the value of the CLCK_SHARED_TEMP_DIR environment variable to the path of a directory that is visible to all nodes in the cluster. Failure to set this environment variable will usually result in an aborted run with a message that a shared temporary directory could not be created correctly.
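
For example, assuming /shared/clck_tmp is an existing directory on a filesystem mounted on every node (the path here is only illustrative), the variable could be set before starting data collection as root:

export CLCK_SHARED_TEMP_DIR=/shared/clck_tmp
clck-collect -f nodefile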

 

Framework Definitions

Framework Definitions, further detailed in the Framework Definitions chapter, can be used to select which providers run when executing clck or clck-collect. Framework Definitions can be specified through the command line by using the -F / --framework-definition command line option. For example, to run myFramework.xml, the following command can be used:

clck-collect -f nodefile -F /path/to/myFramework.xml

Custom Framework Definitions can also be specified in the configuration file /opt/intel/clck/201n/etc/clck.xml. The following example shows how to declare the use of two custom Framework Definitions:

<configuration>
  <plugins>
    <framework_definitions>
      <framework_definition>/path/to/CustomFWD1.xml</framework_definition>
      <framework_definition>/path/to/CustomFWD2.xml</framework_definition>
    </framework_definitions>
  </plugins>
  ...
</configuration>

For more information about Framework Definitions, see the Included Framework Definitions section in the Appendix.

 

Selecting Nodes

The nodefile contains a list of cluster node hostnames, one hostname per line. For compute nodes, the nodefile is a simple list of nodes. For instance, the nodefile provided by a cluster resource manager typically contains just compute nodes and may be used as-is. Intel® Xeon Phi™ coprocessors should be included in the nodefile as independent nodes.

The nodefile is specified using the -f file command line option.

However, in some cases, nodes in the nodefile need to be annotated. The # symbol may be used to introduce comments in a nodefile. Annotations are specially formatted comments containing an annotation keyword followed by a colon and a value. Annotations may alter the data collection behavior.

If no nodefile is specified for data collection (via clck or clck-collect), a Slurm query will be used to determine the available nodes.
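
For example, on a cluster managed by Slurm, the -f option may simply be omitted and the node list is determined by querying Slurm:

clck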

Node Roles

The role annotation keyword is used to assign a node to one or more roles. A role describes the intended functionality of a node. For example, a node might be a compute node. If no role is explicitly assigned, by default a node is assumed to be a compute node. The role annotation may be repeated to assign a node multiple roles.

For example, the following nodefile lists five nodes: node1 is both a head node and a compute node; node2, node3, and node4 are compute nodes; and node5 is commented out and therefore disabled.

node1    # role: head role: compute
node2    # role: compute
node3    # implicitly assumed to be a compute node
node4
#node5

Some data providers will only run on nodes with certain roles. For example, data providers that measure performance typically only run on compute or enhanced nodes.

Valid node role values are described below.

boot

  • Provides software imaging / provisioning capabilities.

 

compute

  • Is a compute resource (mutually exclusive with enhanced).

 

enhanced

  • Provides enhanced compute resources, for example, contains additional memory (mutually exclusive with compute).

 

external

  • Provides an external network interface.

 

head

  • Alias for the union of boot, external, job_schedule, login, network_address, and storage.

 

job_schedule

  • Provides resource manager / job scheduling capabilities.

 

login

  • Is an interactive login system.

 

network_address

  • Provides network addresses to the cluster, for example, via DHCP.

 

storage

  • Provides network storage to the cluster, like NFS.

 

Subclusters

Some clusters contain groups of nodes, or subclusters, that are homogeneous within the subcluster but differ from the rest of the cluster. For example, one subcluster may be connected with Intel® Omni-Path Host Fabric Interface while the rest of the cluster uses Ethernet.

The subcluster annotation keyword is used to assign a node to a subcluster. A node may only belong to a single subcluster. If no subcluster is explicitly assigned, the node is placed into the default subcluster. The subcluster name is an arbitrary string.

For example, the following nodefile defines 2 subclusters, each with 4 compute nodes:

node1 # subcluster: eth
node2 # subcluster: eth
node3 # subcluster: eth
node4 # subcluster: eth
node5 # subcluster: ib
node6 # subcluster: ib
node7 # subcluster: ib
node8 # subcluster: ib

By default, cluster data providers will not span subclusters. To override this behavior, use the following clck-collect command line option:

-S / --ignore-subclusters

  • Ignore subclusters when running cluster data providers. That is, cluster data providers will span subclusters. The default is not to span subclusters.
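
For example, to allow cluster data providers to span the eth and ib subclusters defined in the nodefile above:

clck-collect -f nodefile -S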

 

Collect Missing or Old Data

A fully populated database is necessary for a complete analysis. However, if the database is already partially populated, executing a full data collection is unnecessary. To collect only data that is missing or old, and avoid re-collecting data that is still valid, use the data re-collection feature.

To use this feature, run clck-collect or clck with the -C or --re-collect-data command line option. This option takes no parameters and causes Intel® Cluster Checker to only collect data that is missing or old. This option is useful to avoid executing a full data collection when the database is already populated while still ensuring that all data is present and up to date. If data is missing or old for one or more nodes, that data will be re-collected on all specified (or detected) nodes.
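
For example, to collect only the missing or old data using the same nodefile:

clck -f nodefile -C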

Configuration File Options

The following variables alter the behavior of data collection and are set as options in the configuration file.

Extensions

 

Collect extensions determine how Intel® Cluster Checker collects data. The syntax for selecting a collect extension is as follows:

<collector>
  <extension>mpi.so</extension>
</collector>

Currently, Intel® Cluster Checker uses pdsh by default. The available collect extensions are pdsh (pdsh.so) and Intel® MPI Library (mpi.so), both of which are located at /opt/intel/clck/201n/collect/intel64.

 

CLCK_COLLECT_DATABASE_BUSY_TIMEOUT

 

Specify the amount of time to wait for a database lock to become available.

Environment variable syntax: CLCK_COLLECT_DATABASE_BUSY_TIMEOUT=value

  • where value is the number of milliseconds to wait for a database lock to become available before giving up. The value must be greater than 0. The default value is 60,000 milliseconds.

 

When inserting a new row into the database, the database is locked and any concurrent write attempts are prevented. This value specifies the amount of time that the concurrent write(s) should wait for the database to be unlocked before giving up. If the timeout expires and the database is still locked, the concurrent write(s) will not be successful and the data will be lost.
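
For example, to double the timeout to 120,000 milliseconds (the value here is only illustrative), set the environment variable before starting collection:

export CLCK_COLLECT_DATABASE_BUSY_TIMEOUT=120000
clck -f nodefile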

 

CLCK_COLLECT_DATABASE_CLOSE_DELAY

 

Specify the amount of time to wait after data collection has finished for data to arrive.

Environment variable syntax: CLCK_COLLECT_DATABASE_CLOSE_DELAY=value

  • where value is the number of seconds to wait after data collection has finished for any remaining data to be accumulated. The value must be greater than 0. The default value is 1 second.

 

All data that is in the accumulate queue will always be written to the database, but some data may still be on the wire when data collection has finished. This option provides a method to wait an additional amount of time for data to be received by the accumulate server before exiting. Clusters with very slow networks or a very large number of nodes may need to increase this value from the default.
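
For example, on a cluster with a very slow network, the delay could be raised to 10 seconds (an illustrative value) before starting collection:

export CLCK_COLLECT_DATABASE_CLOSE_DELAY=10
clck-collect -f nodefile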

 

CLCK_COLLECT_DATABASE_VFS_MODULE

 

Specify the SQLite* VFS module.

Environment variable syntax: CLCK_COLLECT_DATABASE_VFS_MODULE=value

  • where value is:
  • unix - Uses POSIX advisory locks when locking the database. Note that the implementation of POSIX advisory locks on some filesystems, for example, NFS, is incomplete and/or buggy. This value should usually only be selected when the database is located on a local filesystem.
  • unix-dotfile - Uses dot-file locking when locking the database. This value usually works around filesystem implementation issues related to POSIX advisory locks. This is the default value.
  • unix-excl - Obtains and holds an exclusive lock on the database file. All concurrent database operations will be prevented while the lock is held. This value may help in the event of database errors during data collection or if collected data is missing from the database.
  • unix-none - No locking is used. This option should only be used if there is a guarantee that only a single writer will modify the database at any given time. Otherwise, this value can easily result in database corruption if two or more processes are writing to the database concurrently.

 

The SQLite* OS interface layer, or VFS, can be selected at runtime. The VFSes differ primarily in the way they handle file locking. See http://www.sqlite.org/vfs.html for more information.
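
For example, to hold an exclusive lock on the database file for the duration of data collection:

export CLCK_COLLECT_DATABASE_VFS_MODULE=unix-excl
clck-collect -f nodefile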
