High performance computing clusters are complex environments, which makes issues and imbalances extremely difficult to identify. Intel® Cluster Checker provides tools to collect data from the cluster, analyze the collected data, and produce a clear report of the analysis. Using Intel® Cluster Checker helps to quickly identify issues and improve utilization of resources.
Intel® Cluster Checker verifies the configuration and performance of Linux®-based clusters through analysis of cluster uniformity, performance characteristics, functionality, and compliance with Intel® High Performance Computing (HPC) specifications. Data collection tools and analysis provide actionable remedies for identified issues. Intel® Cluster Checker tools and analysis are ideal for developers, administrators, architects, and users who need to easily identify issues within a cluster.
This guide provides step-by-step instructions for using Intel® Cluster Checker tools.
Intel® Cluster Checker identifies issues on clusters and provides recommendations for their resolution using observations and diagnoses. Diagnosis of the cluster is performed in two phases: collection and analysis. Collection gathers pertinent details about the configuration of the cluster. Analysis is then performed on the collected data to produce diagnoses and observations.
When analysis is run, a brief summary of the results appears on the screen and a log file is generated with the details of the analysis. The summary categorizes each observation to give an overview of the types of issues found. The categories are functionality, hardware uniformity, software uniformity, and performance.
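To make the idea of a categorized summary concrete, the following is a minimal Python sketch that tallies observations under the four categories named above. It is an illustration only: the record layout, the `summarize` function, the node names, and the sample messages are all hypothetical and do not reflect the actual output or log format of Intel® Cluster Checker.

```python
from collections import Counter

# The four summary categories named in this guide.
CATEGORIES = (
    "functionality",
    "hardware uniformity",
    "software uniformity",
    "performance",
)


def summarize(observations):
    """Count observations per category.

    Every known category appears in the result, even when its count is zero.
    Observations with an unrecognized category are ignored in this sketch.
    """
    counts = Counter(obs["category"] for obs in observations)
    return {category: counts.get(category, 0) for category in CATEGORIES}


# Hypothetical observations, invented purely for illustration.
sample = [
    {"node": "node01", "category": "performance", "message": "example A"},
    {"node": "node02", "category": "performance", "message": "example B"},
    {"node": "node03", "category": "software uniformity", "message": "example C"},
]

if __name__ == "__main__":
    for category, count in summarize(sample).items():
        print(f"{category}: {count}")
```

A summary built this way shows at a glance which kinds of issues dominate, while the individual records correspond to the detail that would be kept in a log file.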
Intel® Cluster Checker tailors data collection and analysis to the level and role of the person using the tool. By default, Intel® Cluster Checker provides a quick analysis of the cluster, with options for more detailed collection and analysis. The Reference contains additional information, including how to run specific sets of checks on a system.