One of the key benefits of Intel Cluster Checker is that it acts like a real user executing commands in the console. The knowledge required to identify what to execute and what to check is thus available to anyone.
For instance, to ensure that the coprocessor is up and running, there are three recommended basic steps: two that run Intel® Many Integrated Core (MIC) Platform Software Stack (MPSS) supporting tools (micinfo and miccheck), plus a benchmark run on the host system that offloads work to the coprocessor to speed up computation.
The micinfo test module checks that coprocessor information is correct and uniform across nodes. Any error, undefined value or abnormal difference among coprocessors is reported when it may impact cluster productivity. More specifically, it makes sure that processor frequency, voltage, memory and speed are non-zero, and that differences among coprocessors stay smaller than 128 MB, 100000 uV or 20 C. This default behavior can be altered by custom configuration if required.
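The kind of check described above can be sketched in a few lines of shell. This is only an illustration of the idea (non-zero readings, spread below a limit), not the Intel Cluster Checker implementation. The function name within_tolerance and the sample readings are hypothetical; only the 128 MB limit comes from the text.

```shell
#!/bin/sh
# Illustrative sketch only; not how Intel Cluster Checker is implemented.
# within_tolerance LIMIT VALUE...
# Succeeds when every VALUE is a non-zero integer and the spread
# (largest minus smallest) is strictly smaller than LIMIT.
within_tolerance() (
    # Need a limit plus at least one reading.
    [ "$#" -ge 2 ] || exit 2
    limit=$1; shift
    min=$1; max=$1
    for v in "$@"; do
        # A zero reading counts as an error.
        [ "$v" -ne 0 ] || exit 1
        [ "$v" -lt "$min" ] && min=$v
        [ "$v" -gt "$max" ] && max=$v
    done
    [ $((max - min)) -lt "$limit" ]
)

# Hypothetical memory sizes (MB) reported by three coprocessors,
# compared against the 128 MB default limit mentioned above.
if within_tolerance 128 7936 7936 7900; then
    echo "memory: within tolerance"
else
    echo "memory: mismatch or zero reading"
fi
```

The same function could be called with 100000 for voltages in uV or 20 for temperatures in C. The body runs in a subshell so that its working variables do not leak into the caller.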
The miccheck test module checks the sanity of the coprocessor cards by running the miccheck diagnostic tool on every node in parallel. Only the non-passing checks are reported.
To run a benchmark that offloads work to a coprocessor, two related environment variables need to be specified: MKL_MIC_ENABLE to enable offload and OFFLOAD_REPORT to enable reporting. The Intel Math Kernel Library decides by itself when offloading is worthwhile. Note that the default input size is too small to justify offload, so a bigger input must be defined.
To check the previous three recommended steps in a single run, the following command can be used:
$ OFFLOAD_REPORT=2 MKL_MIC_ENABLE=1 clck -I micinfo -I miccheck -I dgemm
The environment must also be properly set up for the Intel C/C++ Compiler and the Intel Math Kernel Library before attempting execution.
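A quick pre-flight check can catch an incomplete environment before the run starts. The following is a minimal sketch under stated assumptions: the helper have_tools is hypothetical, icc is assumed to be the compiler command on PATH, and MKLROOT is assumed to be set by the Intel Math Kernel Library setup script. Adjust the names to match the local installation.

```shell
#!/bin/sh
# Illustrative pre-flight sketch; not part of Intel Cluster Checker.
# have_tools NAME...
# Prints every command that is not found on PATH and fails if any is missing.
have_tools() (
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool"
            status=1
        fi
    done
    exit "$status"
)

# Assumptions: icc is the compiler command, and MKLROOT is exported
# once the Intel Math Kernel Library environment has been set up.
if have_tools icc clck micinfo miccheck && [ -n "${MKLROOT:-}" ]; then
    OFFLOAD_REPORT=2 MKL_MIC_ENABLE=1 clck -I micinfo -I miccheck -I dgemm
else
    echo "environment incomplete; set up the compiler and MKL environment first" >&2
fi
```

On a machine without the Intel tools installed, the script only lists what is missing and does not start the run.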
See the Intel Cluster Checker product documentation for more details.
To download the latest release, log into the Intel® Registration Center and click on the Intel® Cluster Checker product.