User Guide

  • 2021.1
  • 01/08/2021
  • Public Content

Data Collection

Before Intel® Cluster Checker can identify issues, it must first gather data from the cluster. Intel® Cluster Checker uses providers to collect data from the system and stores that data in a database. Framework definitions determine what data to collect by defining a set of data providers to run.
Running Data Collection
The clck program triggers data collection followed immediately by analysis. The clck-collect program only triggers data collection.
A typical invocation of the collect command is:

    clck-collect -f <nodefile>
By default, Intel® Cluster Checker will collect and analyze data to evaluate the health of the cluster using the health_base framework definition.
Avoid running data collection as root whenever possible; running as root can cause problems with some data providers. There may be cases in which running as root is necessary, such as when a provider needs to access a tool that is not available to a non-privileged user on a system, but limiting root runs as much as possible is recommended.
Framework Definitions
Framework Definitions, further detailed in the Framework Definitions chapter, can be used to select which providers run when running clck. Framework Definitions can be specified on the command line using the -F option. For example, to run a specific framework definition, the following command can be used:

    clck -F <framework definition>
Custom Framework Definitions can also be specified in the configuration file clck.xml. The following example shows how to declare the use of two custom Framework Definitions:
    <configuration>
      <plugins>
        <framework_definitions>
          <framework_definition>/path/to/CustomFWD1/xml</framework_definition>
          <framework_definition>/path/to/CustomFWD2/xml</framework_definition>
        </framework_definitions>
      </plugins>
      ...
    </configuration>
For more information about Framework Definitions, see the
Framework Definitions
section in the Reference.
Selecting Nodes
The nodefile contains a list of line-separated cluster node hostnames. For compute nodes, the nodefile is a simple list of nodes. For instance, the nodefile provided by a cluster resource manager typically contains just compute nodes and may be used as-is. Intel® Xeon Phi™ coprocessors should be included in the nodefile as independent nodes.
The nodefile is specified using the -f command line option.
However, in some cases, nodes in the nodefile need to be annotated. The # symbol may be used to introduce comments in a nodefile. Annotations are specially formatted comments containing an annotation keyword followed by a colon and a value. Annotations may alter the data collection behavior.
If no nodefile is specified for data collection (via the -f option), a Slurm query will be used to determine the available nodes.
Node Roles
The role annotation keyword is used to assign a node to one or more roles. A role describes the intended functionality of a node. For example, a node might be a compute node. If no role is explicitly assigned, a node is assumed to be a compute node by default. The role annotation may be repeated to assign a node multiple roles.
For example, the following nodefile defines four active nodes: node1 is a head and compute node; node2, node3, and node4 are compute nodes; node5 is disabled.
    node1 # role: head role: compute
    node2 # role: compute
    node3 # implicitly assumed to be a compute node
    node4
    #node5
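As an illustration only (not a product tool), the role annotation grammar can be exercised with a short shell sketch that lists each active node with its roles, applying the default compute role when no role: annotation is present:

```shell
# Hypothetical nodefile used only for this illustration.
cat > nodefile <<'EOF'
node1 # role: head role: compute
node2 # role: compute
node3
#node5
EOF
# Print each active node with its roles; default to "compute" when no
# role: annotation appears in the comment.
roles=$(awk '!/^[[:space:]]*#/ && NF {
    line = $1 ":"; found = 0
    for (i = 3; i <= NF; i++)
        if ($(i - 1) == "role:") { line = line " " $i; found = 1 }
    if (!found) line = line " compute"
    print line
}' nodefile)
echo "$roles"
```

Disabled nodes (lines beginning with #) and blank lines are skipped, matching the comment rules described above.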
Some data providers will only run on nodes with certain roles. For example, data providers that measure performance typically only run on compute or enhanced nodes.
Valid node role values are described below.
  • boot - Provides software imaging / provisioning capabilities.
  • compute - Is a compute resource (mutually exclusive with enhanced).
  • enhanced - Provides enhanced compute resources, for example, contains additional memory (mutually exclusive with compute).
  • external - Provides an external network interface.
  • head - Alias for the union of boot, external, job_schedule, login, network_address, and storage.
  • job_schedule - Provides resource manager / job scheduling capabilities.
  • login - Is an interactive login system.
  • network_address - Provides network address to the cluster, for example, DHCP.
  • storage - Provides network storage to the cluster, like NFS.
Some clusters contain groups of nodes, or subclusters, that are homogeneous within the subcluster but differ from the rest of the cluster. For example, one subcluster may be connected with Intel® Omni-Path Host Fabric Interface while the rest of the cluster uses Ethernet.
The subcluster annotation keyword is used to assign a node to a subcluster. A node may only belong to a single subcluster. If no subcluster is explicitly assigned, the node is placed into the default subcluster. The subcluster name is an arbitrary string.
For example, the following nodefile defines 2 subclusters, each with 4 compute nodes:
    node1 # subcluster: eth
    node2 # subcluster: eth
    node3 # subcluster: eth
    node4 # subcluster: eth
    node5 # subcluster: ib
    node6 # subcluster: ib
    node7 # subcluster: ib
    node8 # subcluster: ib
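As a sketch (not a product tool), the subcluster annotations can be tallied to verify the grouping; nodes without an annotation fall into the default subcluster:

```shell
# Hypothetical nodefile used only for this illustration.
cat > nodefile <<'EOF'
node1 # subcluster: eth
node2 # subcluster: eth
node3 # subcluster: ib
node4
EOF
# Count active nodes per subcluster; unannotated nodes go to "default".
summary=$(awk -F'subcluster:' '!/^[[:space:]]*#/ && NF {
    s = (NF > 1) ? $2 : "default"
    gsub(/[[:space:]]/, "", s)
    count[s]++
}
END { for (s in count) print s, count[s] }' nodefile)
echo "$summary"
```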
By default, cluster data providers will not span subclusters. To override this behavior, use the command line option that ignores subclusters when running cluster data providers; cluster data providers will then span subclusters.
Collect Missing or Old Data
A fully populated database is necessary for a complete analysis. However, if the database is already partially populated, running a full data collection is unnecessary. To avoid re-collecting valid data, use the data re-collection feature, which collects only data that is missing or old.
To use this feature, run clck with the re-collect command line option. This option takes no parameters and causes Intel® Cluster Checker to collect only data that is missing or old. It is useful for avoiding a full data collection when the database is already populated, while still ensuring that all data is present and up to date. If data is missing or old for one or more nodes, that data will be re-collected on all specified (or detected) nodes.
Note on deprecation: Intel® Cluster Checker will deprecate the re-collect functionality available on the command line or through the configuration file. Rather than collecting only old or missing data, Cluster Checker will run the full data collection phase for the associated framework definitions (FWD).
Environment Propagation
Intel® Cluster Checker will automatically propagate the environment in which it is run when using certain collect extensions. This is currently supported by:
  • pdsh
This is done by copying and exporting all environment variables except the following:
  • HOST
  • PMI_FD
  • PWD
  • _
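The documented filtering rule can be mimicked with a short shell sketch (this illustrates the exclusion list; it is not the product's implementation):

```shell
# Copy the environment, dropping the documented exclusions
# (HOST, PMI_FD, PWD, _). A sketch of the rule only.
propagated=$(env | grep -vE '^(HOST|PMI_FD|PWD|_)=')
# Show a few of the variables that would be propagated.
echo "$propagated" | head -n 3
```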
This feature can be turned off:
  • through the environment, by setting the corresponding environment variable
  • through turning it off in clck.xml (or whichever configuration file is used)
    • ‘<turn-off-environment-propagation>on</turn-off-environment-propagation>’
  • or by running with the ‘-e’ flag
Configuration File Options
The following variables alter the behavior of data collection as options in the configuration file.
Collect extensions determine how Intel® Cluster Checker collects data. To change which collect extension is used, edit the configuration file clck.xml. The syntax for selecting a collect extension is as follows:
    <collector>
      <extension></extension>
    </collector>
Currently, Intel® Cluster Checker uses pdsh by default. The available collect extensions are pdsh and Intel® MPI Library or MPICH, both of which are included with the installation. Use of MPI varieties other than Intel® MPI Library and MPICH is not expected to work.
When you choose a specific MPI, that MPI will be used both for launching Cluster Checker and for running any MPI workloads in the requested framework definitions. Using MPICH to run framework definitions containing the Intel MPI Benchmarks (IMB) or the HPCG benchmark will not work. The IMB benchmarks are found in framework definitions starting with 'imb' and in a handful of other framework definitions that run benchmarks, such as 'health_extended_user' or 'select_solutions_sim_mod_benchmarks_plus_2018.0'.
In order for Intel® MPI Library or MPICH to be used successfully, the corresponding extension must be uncommented in the clck.xml file and the information for the desired MPI must be correct. For Intel® MPI Library, ensure the appropriate environment script is sourced. For MPICH, ensure PATH and LD_LIBRARY_PATH are configured as described in the MPICH Installer's Guide.
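A minimal environment-setup sketch follows; every path in it is an assumption and must be adjusted to the actual installation locations on your system:

```shell
# All paths below are examples -- adjust for your installation.

# Intel MPI Library: source the environment script shipped with the product
# (for oneAPI installations this is typically setvars.sh).
[ -f /opt/intel/oneapi/setvars.sh ] && . /opt/intel/oneapi/setvars.sh

# MPICH: set PATH and LD_LIBRARY_PATH per the MPICH Installer's Guide.
export PATH=/usr/local/mpich/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/mpich/lib:${LD_LIBRARY_PATH:-}
```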
Specify the amount of time to wait for a database lock to become available.
Environment variable syntax:
The value is the number of milliseconds to wait for a database lock to become available before giving up. The value must be greater than 0. The default value is 60,000 milliseconds.
When inserting a new row into the database, the database is locked and any concurrent write attempts are prevented. This value specifies the amount of time that the concurrent write(s) should wait for the database to be unlocked before giving up. If the timeout expires and the database is still locked, the concurrent write(s) will not be successful and the data will be lost.
Specify the amount of time to wait after data collection has finished for data to arrive.
Environment variable syntax:
The value is the number of seconds to wait after data collection has finished for any remaining data to be accumulated. The value must be greater than 0. The default value is 1 second.
All data that is in the accumulate queue will always be written to the database, but some data may still be on the wire when data collection has finished. This option provides a method to wait an additional amount of time for data to be received by the accumulate server before exiting. Clusters with very slow networks or a very large number of nodes may need to increase this value from the default.
Specify the SQLite* VFS module.
Environment variable syntax:
  • unix
    Uses POSIX advisory locks when locking the database. Note that the implementation of POSIX advisory locks on some filesystems, for example, NFS, is incomplete and/or buggy. This value should usually only be selected when the database is located on a local filesystem.
  • unix-dotfile
    Uses dot-file locking when locking the database. This value usually works around filesystem implementation issues related to POSIX advisory locks. This is the default value.
  • unix-excl
    Obtains and holds an exclusive lock on the database file. All concurrent database operations will be prevented while the lock is held. This value may help in the event of database errors during data collection or if collected data is missing from the database.
  • unix-none
    No locking is used. This option should only be used if there is a guarantee that only a single writer will modify the database at any given time. Otherwise, this value can easily result in database corruption if two or more processes are writing to the database concurrently.
The SQLite* OS interface layer, or VFS, can be selected at runtime. The VFSes differ primarily in the way they handle file locking. See the SQLite VFS documentation for more information.
Specify the amount of time to wait for a collect extension to finish before closing.
Environment variable syntax:
The value is the number of seconds to wait for the extension to finish. The value must be greater than 0. The default value is 1 week.
Custom libfabric Provider
Some of the framework definitions use Intel® MPI Library to run MPI benchmarks, for example, the Select Solutions frameworks or the IMB frameworks. In some scenarios it may be desirable to use a different libfabric OFI provider when running your MPI application, including applications run through Cluster Checker.
To override what Cluster Checker selects, set the environment variable I_MPI_OFI_PROVIDER to a specific libfabric provider. A list of the providers your server supports can be discovered by running the fi_info command. We suggest setting I_MPI_OFI_PROVIDER in your .bashrc file or job submission script, for example:

    export I_MPI_OFI_PROVIDER=sockets

where the value 'sockets' is replaced by a libfabric provider listed in the output of fi_info.
By default, Intel® Cluster Checker and Intel® MPI Library will choose an optimized fabric provider, but there are scenarios where it is worthwhile to override those defaults for testing.
Note: Intel® TrueScale InfiniBand
TrueScale IB is only supported with Intel® MPI Library 2018; newer versions of Intel® MPI Library do not support TrueScale. TrueScale also has limited support on newer operating systems. When collecting data with TrueScale IB, be sure the environment variables I_MPI_FABRICS=tmi and I_MPI_TMI_PROVIDER=psm are present, either in the Slurm script or exported.
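For example, the two variables named above can be exported before collection (or set in the Slurm job script):

```shell
# TrueScale IB with Intel MPI Library 2018: select the TMI fabric and the
# PSM provider before running data collection.
export I_MPI_FABRICS=tmi
export I_MPI_TMI_PROVIDER=psm
```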
