• 4.0.0
  • 04/10/2020
  • Public Content

Environment Setup
  • Intel® Cluster Checker must be accessible by the same path on all nodes.
  • A readable, writable shared directory must be available from the same path on all nodes for temporary file creation. Intel® Cluster Checker uses $HOME as the shared directory by default; to use a different directory, set the environment variable CLCK_SHARED_TEMP_DIR to its path.
    • For users with administrative privileges, such as root, the environment variable CLCK_SHARED_TEMP_DIR must be set explicitly.
  • source mpivars.[sh | csh] from Intel® MPI Library
  • source mklvars.[sh | csh] from Intel® Math Kernel Library (Intel® MKL)
  • source psxevars.[sh | csh] or compilervars.[sh | csh] from Intel® Parallel Studio XE Cluster Edition
  • source clckvars.[sh | csh] from Intel® Cluster Checker
  • On a multi-node machine, either passwordless ssh between all nodes or Intel® MPI Library is required.
    • Passwordless ssh Setup
    • Intel® MPI Library Setup
      • Either locate and edit the clck/2019.5/etc/clck.xml file to uncomment the line containing “<extension>mpi.so</extension>” by removing the comment markers “<!--” before it and “-->” after it.
      • Or copy the clck/2019.5/etc/clck.xml file to a local directory, uncomment the “<extension>mpi.so</extension>” line in the copy, and add the following option when executing Intel® Cluster Checker:
        • -c <path_to_local_copy_of_clck.xml>
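The shared-directory and local-configuration steps above can be sketched in shell as follows. This is a minimal illustration, not the product's own procedure: the directory under /tmp, the file names clck_local_example.xml and clck_local.xml, and the layout of the stand-in XML file are all assumptions. On a real cluster the shared directory must be mounted at the same path on every node, and you would copy the actual clck.xml rather than create a stand-in.

```shell
# Placeholder path (assumption): on a real cluster this must be a directory
# that every node mounts at the same path; /tmp is normally node-local.
shared_dir="${TMPDIR:-/tmp}/clck_shared_example"
mkdir -p "$shared_dir"
export CLCK_SHARED_TEMP_DIR="$shared_dir"

# Confirm the directory is readable and writable before running clck.
if [ -r "$CLCK_SHARED_TEMP_DIR" ] && [ -w "$CLCK_SHARED_TEMP_DIR" ]; then
    echo "shared temp dir is readable and writable: $CLCK_SHARED_TEMP_DIR"
else
    echo "cannot use $CLCK_SHARED_TEMP_DIR as shared temp dir" >&2
fi

# Stand-in for a local copy of clck.xml; the layout of the real file is assumed.
cat > clck_local_example.xml <<'EOF'
<configuration>
  <!--<extension>mpi.so</extension>-->
</configuration>
EOF

# Strip the comment markers around the mpi.so extension line only,
# writing the result to a new file so the original copy is untouched.
sed 's|<!--[[:space:]]*\(<extension>mpi\.so</extension>\)[[:space:]]*-->|\1|' \
    clck_local_example.xml > clck_local.xml
```

With a real local copy prepared this way, the run would then use the -c option, for example clck -c clck_local.xml.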
Initial Execution
This section shows how to run Intel® Cluster Checker in the two supported scenarios: with an individual nodefile and with Slurm.
Setup with Individual Nodefile
A nodefile specifies which nodes to include and, if applicable, their roles. Intel® Cluster Checker contains a set of pre-defined roles. A separate hostname appears on each line. If no role is specified for a node, that node is considered a compute node. The following example includes four compute nodes.
node1
node2
node3
node4
A cluster with a single node would only include one hostname in the nodefile.
You can then do your first run of Intel® Cluster Checker by running
clck -f <nodefile>
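As a small sketch of the nodefile workflow, the following writes the four-node example to a file, one hostname per line, and runs Intel® Cluster Checker against it only if clck is on the PATH. The file name nodefile.example is hypothetical, and node1 through node4 are the illustrative hostnames from the example above.

```shell
# Hypothetical nodefile name; one hostname per line, all treated as compute nodes.
nodefile="./nodefile.example"
printf '%s\n' node1 node2 node3 node4 > "$nodefile"

# Run the check only when the clck command is available in this shell.
if command -v clck >/dev/null 2>&1; then
    clck -f "$nodefile"
else
    echo "clck not found; source clckvars.sh first" >&2
fi
```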
Setup With Slurm
Whether you submit a batch script (sbatch) or allocate nodes interactively (salloc), Intel® Cluster Checker automatically uses the list of allocated nodes.
If running on the command line with salloc, remember to set up the environment once the allocation is granted. You can then invoke Intel® Cluster Checker by running:
clck
If running with sbatch, use a script that includes the environment setup described above, followed by the clck command:
source mpivars.[sh | csh]
source mklvars.[sh | csh]
source psxevars.[sh | csh] or compilervars.[sh | csh]
source clckvars.[sh | csh]
clck
You can then run
sbatch <script_name>
In both of the above cases, Intel® Cluster Checker should generate a summary output and an in-depth clck_results.log file.
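For the sbatch case, a batch script might look like the sketch below. The script name clck_job.sh, the job name, the node count, and every /path/to/... location are placeholders chosen for illustration, not values from the product documentation. The example writes the script to disk and only syntax-checks it, so nothing is submitted.

```shell
# Write a hypothetical batch script; every path inside it is a placeholder.
cat > clck_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=clck_example
#SBATCH --nodes=4

# Placeholder install locations (assumptions, adjust to your system).
MPI_ENV=/path/to/mpivars.sh
MKL_ENV=/path/to/mklvars.sh
PSXE_ENV=/path/to/psxevars.sh
CLCK_ENV=/path/to/clckvars.sh

# Some of these scripts may expect an architecture argument;
# check your installation before relying on the bare form.
source "$MPI_ENV"
source "$MKL_ENV"
source "$PSXE_ENV"
source "$CLCK_ENV"

# Inside a Slurm allocation, clck uses the allocated nodes automatically.
clck
EOF

# Syntax-check the generated script without executing it.
bash -n clck_job.sh && echo "clck_job.sh is ready: submit with sbatch clck_job.sh"
```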
User-Specific Workflows
Intel® Cluster Checker uses Framework Definitions to specify what data is collected, how data is analyzed, and how that information is displayed. By default, Intel® Cluster Checker runs the health_base Framework Definition, which provides an overall examination of the health of the cluster. Intel® Cluster Checker provides a wide variety of Framework Definitions to customize your results; all of them are located in the etc/fwd directory under the Intel® Cluster Checker install directory. This section describes the highest-level Framework Definitions for particular types of users. To get a full list of available Framework Definitions, run
clck -X list
You can also find a full list of Framework Definitions online.
Admin:
For the privileged user, there are four different common-use Framework Definitions for cluster analysis. When first running as an administrator, run
clck <options> -F health_base
You can then look in the file clck_results.log to read the in-depth results of the analysis. These are preliminary checks that would work for either user or administrator. For a more comprehensive, administrator-specific run, next run
clck <options> -F health_admin
If you want to do more in-depth checking of the intricacies of your cluster’s uniformity, you can use the Framework Definitions system_configuration_uniformity, which will ensure the system is configured uniformly across nodes, kernel_parameter_uniformity, which will give an analysis of the uniformity of the kernel setup, or lshw_hardware_uniformity, which will find discrepancies in hardware or firmware between nodes.
clck <options> -F system_configuration_uniformity
clck <options> -F kernel_parameter_uniformity
clck <options> -F lshw_hardware_uniformity
You can run all of the above in a single run by specifying multiple Framework Definitions at once:
clck <options> -F health_admin -F system_configuration_uniformity -F kernel_parameter_uniformity -F lshw_hardware_uniformity
This command will provide preliminary analysis on the screen, with more details available by default in the file clck_results.log. At this point you can explore other framework options to find what serves your needs best. Be aware that some of the user-level Framework Definitions do not run well as root since they include execution of an MPI parallel application.
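The combined administrator run can also be assembled from a list, which is convenient when the set of Framework Definitions changes between runs. The sketch below builds one -F option per name and invokes clck only if it is available. The variable names fwds and fwd_args are illustrative, and any other options you need (for example -f with a nodefile) would have to be added to the command yourself.

```shell
# Framework Definitions named in the administrator workflow above.
fwds="health_admin system_configuration_uniformity kernel_parameter_uniformity lshw_hardware_uniformity"

# Build one -F option per Framework Definition.
fwd_args=""
for fwd in $fwds; do
    fwd_args="$fwd_args -F $fwd"
done
fwd_args="${fwd_args# }"   # drop the leading space

# Show the command that would be run.
echo "clck $fwd_args"

if command -v clck >/dev/null 2>&1; then
    # Left unquoted on purpose so the string splits into separate arguments.
    clck $fwd_args
else
    echo "clck not found; source clckvars.sh first" >&2
fi
```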
User:
For the non-privileged cluster user, there are two common-use Framework Definitions for cluster analysis. When first running, run
clck <options> -F health_base
You can then look in the file clck_results.log to read the in-depth results of the analysis. In the event that you desire more extended checking, you can next run
clck <options> -F health_extended_user
This command will provide preliminary analysis on the screen, with more details available by default in the file clck_results.log. At this point you can explore other framework options to find what serves your needs best. Be aware that not all tools are accessible to non-privileged users, so some checks may report missing data.
Intel® MPI Library Troubleshooting
Admin:
For the privileged user wanting to make sure their cluster is set up to work with the Intel® MPI Library, run
clck <options> -F mpi_prereq_admin
This Framework Definition helps debug BIOS, software, environment, and hardware issues that could be causing sub-optimal performance or problems using the Intel® MPI Library.
User:
For the non-privileged user wanting to make sure their cluster is set up to work with the Intel® MPI Library, run
clck <options> -F mpi_prereq_user
This Framework Definition helps debug environment and software issues that could be causing sub-optimal performance or problems using the Intel® MPI Library.
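If the same wrapper script is used by both administrators and regular users, the choice between the two MPI prerequisite Framework Definitions can be made at run time. The sketch below treats user ID 0 (root) as the privileged user, which is an assumption; sites that grant administrative rights in other ways would need a different test. The variable name fwd is illustrative.

```shell
# Assumption: only uid 0 (root) counts as the privileged user here.
if [ "$(id -u)" -eq 0 ]; then
    fwd="mpi_prereq_admin"
else
    fwd="mpi_prereq_user"
fi

# Show the command that would be run.
echo "clck -F $fwd"

if command -v clck >/dev/null 2>&1; then
    clck -F "$fwd"
else
    echo "clck not found; source clckvars.sh first" >&2
fi
```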
