Intel® Cluster Checker 2021 Current Beta (beta05) for Linux* - Release Notes
-------------------------------------------------------------------------------

CONTENTS
--------
1. OVERVIEW
2. NEW FEATURES
3. SYSTEM REQUIREMENTS
4. WHERE TO FIND THE RELEASE
5. INSTALLATION NOTES
6. DOCUMENTATION
7. KNOWN LIMITATIONS AND TROUBLESHOOTING
8. TECHNICAL SUPPORT
9. DISCLAIMER AND LEGAL INFORMATION

-------------------------------------------------------------------------------
1. OVERVIEW
-------------------------------------------------------------------------------

Intel® Cluster Checker verifies the configuration and performance of
Linux*-based clusters and checks the cluster's compliance with the Intel®
Select Solutions for Simulation and Modeling.

-------------------------------------------------------------------------------
1.1. RELATED PRODUCTS AND SERVICES
-------------------------------------------------------------------------------

Information about Intel® oneAPI beta software development products is
available at http://www.intel.com/software/en-us/oneapi

These are some of the products related to Intel® Cluster Checker:

  o The Intel® oneAPI Data Parallel C++ and Fortran Compilers include advanced
    optimization and multithreading capabilities, highly optimized performance
    libraries, and analysis tools for creating fast, reliable multithreaded
    applications.
    http://www.intel.com/software/en-us/oneapi

  o The Intel® oneAPI HPC Toolkit contains the Intel® MPI Library for Linux*,
    the Intel® Trace Analyzer and Collector for Linux*, and the Intel® Math
    Kernel Library Cluster Edition for Linux*. These award-winning development
    tools are used to create, analyze, and optimize high-performance
    applications on clusters of Intel® processor-based systems.
    http://www.intel.com/en-us/oneapi/hpc-kit

-------------------------------------------------------------------------------
2. NEW FEATURES
-------------------------------------------------------------------------------
2.1 WHAT'S NEW IN VERSION 2021 Beta 5
-------------------------------------------------------------------------------

- Extensive changes made to the formatting of the output to enhance
  readability and parsing of the analysis.
    - Grouping of issues by type: functionality, performance, uniformity
    - Summary report on the number of each type of issue
    - Separation of Cluster Checker execution issues from system
      environmental issues
    - Detailed analysis and recommendations are recorded in an analysis log
      file
- An option to redirect the generated reports from Cluster Checker into a
  JSON-formatted file to enable separate reporting and analysis by user tools
    - To enable, include clck_json as part of the block of the Cluster
      Checker configuration XML file (default file: /etc/clck.xml)
- Extended the collection and analysis capabilities of Intel® Cluster Checker
  to include the OSU Micro-Benchmark Suite (must be downloaded and installed
  separately: https://mvapich.cse.ohio-state.edu/benchmarks/). The OSU
  Collectives Benchmark Suite includes point-to-point, blocking and
  non-blocking collectives benchmark tests that verify the functionality of
  the MPI functions.
- Introduction of new Intel® MPI blocking and non-blocking collectives
  benchmark tests for the verification of functionality of Intel® MPI. The
  results are collected and analyzed.
- Introduction of a new beta feature to perform analysis on specific node
  groups defined by the user or system administrator in a nodegroup file.
  This beta feature only applies to the analysis of the nodes, not the
  collection of data on the nodes. An example node group file is available in
  /etc/example_group_file.xml. The documentation provides further details on
  this new beta feature.
- CVE fixes for libxml2: CVE-2020-7595 and CVE-2019-20388

-------------------------------------------------------------------------------
2.2 OLDER VERSIONS
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
2.2.1 VERSION 2021 Beta 4
-------------------------------------------------------------------------------

- Common Vulnerabilities and Exposures (CVE) for SQLite3: Updated SQLite3
  package to sqlite3.31.0 to address a group of CVEs identified in late 2019.

-------------------------------------------------------------------------------
2.2.2 VERSION 2021 Beta 2 / 2021 Beta 3
-------------------------------------------------------------------------------

- Added environment modules support for Intel® Cluster Checker.
    - The Environment Module file is found in
      /clck/2021.1-beta02/env/modulefile/
    - The command:
          module use /clck/2021.1-beta02/env/modulefile/
      will add clck to your module environment
    - module av will show what modules are available to be loaded via
      module load
- Added patch to address SQLite CVE-2019-9937.
- Improved clarity and response output for certain checks.

-------------------------------------------------------------------------------
2.2.3 OLDER RELEASE NOTES
-------------------------------------------------------------------------------

- The release notes for older versions of Cluster Checker can be found at:
  https://software.intel.com/en-us/articles/intel-cluster-checker-release-notes-and-new-features

-------------------------------------------------------------------------------
3. SYSTEM REQUIREMENTS
-------------------------------------------------------------------------------

The following sections describe hardware and software requirements.

-------------------------------------------------------------------------------
3.1. HARDWARE
-------------------------------------------------------------------------------

- Intel® Xeon® processor (Intel® 64 architecture)
- 1 GB of RAM recommended
- 160 MB of free hard disk space required for installation

-------------------------------------------------------------------------------
3.2. SOFTWARE
-------------------------------------------------------------------------------

Operating Systems:
  - CentOS 6 or 7
  - Red Hat* Enterprise Linux* 6 or 7
  - SUSE* Linux* Enterprise Server 11 or 12
  - Ubuntu* 14.04, 16.04, or 17.04 (See Section 7 for known issues)

Runtimes:
  - Intel® MPI Library

Note: While the full SDK versions of these components fulfill the
requirement, only the runtime library is required.

-------------------------------------------------------------------------------
4. WHERE TO FIND THE RELEASE
-------------------------------------------------------------------------------

Intel® Cluster Checker can be installed with Intel® oneAPI HPC Toolkit. See
the Installation section of the User Guide or the Installation documentation
for the Intel® oneAPI HPC Toolkit for more information.

-------------------------------------------------------------------------------
5. INSTALLATION NOTES
-------------------------------------------------------------------------------

The default Intel® Cluster Checker install path is:
    /opt/intel/inteloneapi/clck/2021.1-beta05

Notes:
- Intel® Cluster Checker needs to be installed on all nodes. This can be
  accomplished either by installing into a shared directory or by installing
  a local copy on each node.
- To install a local copy on each node, repeat the package installation for
  each node.

-------------------------------------------------------------------------------
6. DOCUMENTATION
-------------------------------------------------------------------------------

This release of Intel® Cluster Checker includes the following documentation:

The Getting Started Guide walks through using Intel® Cluster Checker for the
first time.

The Intel® Cluster Checker User's Guide contains information about how to
use, configure, and extend Intel® Cluster Checker. The User's Guide describes
the basic usage models, contains information about specific configuration
options, explains how to embed Intel® Cluster Checker functionality into
other applications, shows how to add new checks to the tool, and demonstrates
how to modify existing checks.

The Intel® Cluster Checker API reference describes the API that may be used
to embed Intel® Cluster Checker functionality into other software programs.

The documentation can be found at:
https://software.intel.com/en-us/intel-cluster-checker-support/documentation.

-------------------------------------------------------------------------------
7. KNOWN LIMITATIONS AND TROUBLESHOOTING
-------------------------------------------------------------------------------

The following is a list of known issues in this release.

- Data collection behavior and functionality

  o Currently Cluster Checker will not collect data for the HPCG benchmarks
    correctly if Intel® MPI and MPICH (www.mpich.org) are both installed on
    the environment being tested. The test environment will look for and
    execute the Intel® MPI optimized binary for HPCG and thus reset the
    environment variables for MPICH. Discovery and handling of this
    limitation will be corrected in future versions of Intel® Cluster
    Checker.

  o The imb_pingpong_fabric_performance framework definition, when launched
    with an odd number of nodes through Slurm with MPI as the collector
    mechanism (mpi.so), will report no-data for the last server assigned to
    the Slurm job.
    The workaround is to use a nodefile to specifically test the last server,
    where ‘no-data’ was reported, with another server in the infrastructure.

  o The compute node hostname identified in the nodefile must match the
    hostname reported by either the uname or hostname utility on the compute
    node itself. Deviations in the hostnames, or use of fully qualified
    domain names in either the nodefile or the compute node, may produce
    inaccurate uniformity percentages and counts and be reported as a failure
    or warning by Cluster Checker.

  o To run the HPCG benchmarks (such as in the checks hpcg_single and
    hpcg_cluster) when the Intel® MPI Library and Intel® Math Kernel Library
    (Intel® MKL) runtimes are installed in a non-standard path, the runtime
    libraries must be installed and their locations exported in
    LD_LIBRARY_PATH on the system.

  o Use of the latest runtime libraries for Intel® MPI Library and Intel®
    Math Kernel Library is required to ensure compatibility with Intel®
    Cluster Checker.

  o If the temporary directory used during collection is located on a shared
    file system, the directory will not be deleted.

  o The ORCM plugin is a technical preview feature.

  o Databases located on NFS file systems mounted with the "nolock" option
    are not supported. Not all data from concurrent data collection instances
    per database will be written to the database and the database may become
    corrupted. A single data collector instance per database can usually be
    used successfully in this case.

  o The error "Error: disk I/O error" may be generated when accessing a
    database located on a Lustre file system. The Lustre file system must be
    mounted with the "-o flock" option.

  o The 'iozone' data provider does not execute correctly on diskless
    clusters.

  o If collecting data as root, the value of the CLCK_SHARED_TEMP_DIR
    environment variable must be set to the fully-qualified path of a
    directory accessible on all nodes.
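
As an illustration of the CLCK_SHARED_TEMP_DIR requirement above, the
following is a minimal pre-flight sketch, not part of Intel® Cluster Checker.
The function name check_shared_temp_dir is hypothetical, and the sketch
assumes that "fully-qualified path" means an absolute path. It inspects only
the node it runs on, so it cannot confirm that the directory is reachable
from every node.

```python
import os
from typing import Mapping, Optional


def check_shared_temp_dir(env: Mapping[str, str]) -> Optional[str]:
    """Return a problem description, or None if the setting looks usable.

    Hypothetical helper: only CLCK_SHARED_TEMP_DIR comes from the release
    notes; the checks themselves are an assumption about what "usable" means.
    """
    value = env.get("CLCK_SHARED_TEMP_DIR")
    if not value:
        return "CLCK_SHARED_TEMP_DIR is not set"
    if not os.path.isabs(value):
        return f"CLCK_SHARED_TEMP_DIR is not an absolute path: {value}"
    if not os.path.isdir(value):
        return f"CLCK_SHARED_TEMP_DIR is not an existing directory: {value}"
    return None


if __name__ == "__main__":
    # Checks the local environment only; repeat on each node as needed.
    problem = check_shared_temp_dir(os.environ)
    print(problem or "CLCK_SHARED_TEMP_DIR looks usable on this node")
```

Running the script on each node before collecting data as root gives a quick
indication of a missing or relative setting.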

  o When collecting data on Ubuntu*, if the installed "which" command does
    not support --skip-functions and --skip-alias, a few providers will need
    additional configuration and a few providers will not run successfully.
    The following providers must be configured for the specification of
    absolute binary location:
      - cpuid
      - cpupower
      - dmesg
      - ibstat
      - lscpu
      - numactl
      - opahfirev
      - opasmaquery
    Refer to Intel® Cluster Checker User Manual, Chapter 6 for details about
    specifying absolute binary paths for the above mentioned providers.

  o Intel® Cluster Checker uses the command "ldconfig -p" as well as the
    environment variable LD_LIBRARY_PATH to detect the presence of required
    libraries. In order for Intel® Cluster Checker to detect required
    libraries, they must be present in the LD_LIBRARY_PATH or the result of
    "ldconfig -p".
    (Applies to the Framework Definitions second-gen-xeon-sp_user,
    second-gen-xeon-sp_priv, intel_hpc_platform_compat-hpc-2018.0,
    intel_hpc_platform_sdvis-core-2018.0, and
    intel_hpc_platform_second-gen-xeon-sp-2019.0)

  o In order for Intel® Cluster Checker to detect the Intel® Distribution for
    Python*, it must be in the user’s PATH.
    (Applies to the Framework Definitions second-gen-xeon-sp_user,
    second-gen-xeon-sp_priv, intel_hpc_platform_compat-hpc-2018.0, and
    intel_hpc_platform_second-gen-xeon-sp-2019.0)

  o If Intel® Parallel Studio is sourced before the Intel® Distribution for
    Python* in the user's environment, Intel® Cluster Checker is unable to
    detect all the required libraries for Intel® MPI Library.
    (Applies to the Framework Definitions second-gen-xeon-sp_user,
    second-gen-xeon-sp_priv, intel_hpc_platform_compat-hpc-2018.0, and
    intel_hpc_platform_second-gen-xeon-sp-2019.0)

  o The detected version of Intel® MPI Library is used to determine whether
    Intel® Cluster Checker checks for Intel® Parallel Studio 2018 or 2019.
    If the Intel® MPI Library version does not match the version of the rest
    of Intel® Parallel Studio, the wrong set of libraries will be checked.
    (Applies to the Framework Definition
    intel_hpc_platform_compat-hpc-2018.0)

  o Intel® Cluster Checker can only detect the version of the Intel® Fortran
    Compiler with Intel® Parallel Studio 2017 or later.
    (Applies to the Framework Definitions second-gen-xeon-sp_user,
    second-gen-xeon-sp_priv, intel_hpc_platform_compat-hpc-2018.0, and
    intel_hpc_platform_second-gen-xeon-sp-2019.0)

  o In addition, there are limitations to validating Intel® Select Solutions
    compliance when running on Ubuntu. It is not recommended to use Intel®
    Cluster Checker for Intel® Select Solutions compliance when running on
    Ubuntu.

- Analysis behavior and functionality

  o Clusters containing dual port InfiniBand* adapters where the second port
    is unused should suppress the 'infiniband-port-physical-state-not-linkup'
    and 'infiniband-port-state-not-active' signs. See Chapter 4 of the User's
    Guide for more information on how to suppress signs.

  o When using the Linux* boot parameter isolcpus with an Intel® Xeon Phi(TM)
    processor using default MPI settings, MPI based applications may fail. If
    possible, change or remove the isolcpus Linux* boot parameter. If this is
    not possible and you are using the Intel® MPI Library, you can try
    setting I_MPI_PIN to off. Refer to the Intel® Cluster Checker reference
    manual for details on specifying environment variables for tests.

  o When run with the dgemm/dgemm_cpu_performance or
    stream/stream_memory_bandwidth_performance framework, "stream-outlier" or
    "dgemm-data-is-substandard" may be observed, as the corresponding
    provider scripts may not yield the expected performance with SNC-2/SNC-4
    cluster mode and Flat memory mode configurations for Intel® Xeon Phi(TM)
    processor.
    There may be an issue with the kernel itself (BZ#1479763), documented at
    https://access.redhat.com/errata/RHBA-2017:2581
    If there are no corresponding diagnoses, the signs may be suppressed.

  o The sign paraview-missing fires despite ParaView* being present on the
    system.
    (Applies to the Framework Definition
    intel_hpc_platform_sdvis-cluster-2018.0)

-------------------------------------------------------------------------------
8. TECHNICAL SUPPORT
-------------------------------------------------------------------------------

If you did not register Intel® Cluster Checker during installation, please do
so at the Intel® Software Development Products Registration Center at
http://registrationcenter.intel.com. Registration entitles you to free
technical support, product updates and upgrades for the duration of the
support term.

For information about how to find Technical Support, Product Updates, User
Forums, FAQs, tips and tricks, and other support information, please visit:
https://software.intel.com/en-us/oneapi/support

Note: If your distributor provides technical support for this product, please
contact them for support rather than Intel.

-------------------------------------------------------------------------------
9. DISCLAIMER AND LEGAL INFORMATION
-------------------------------------------------------------------------------

No license (express or implied, by estoppel or otherwise) to any intellectual
property rights is granted by this document.

Intel disclaims all express and implied warranties, including without
limitation, the implied warranties of merchantability, fitness for a
particular purpose, and non-infringement, as well as any warranty arising
from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in
development. All information provided here is subject to change without
notice.
Contact your Intel representative to obtain the latest forecast, schedule,
specifications and roadmaps.

The products and services described may contain defects or errors known as
errata which may cause deviations from published specifications. Current
characterized errata are available on request.

Intel technologies’ features and benefits depend on system configuration and
may require enabled hardware, software or service activation. Learn more at
Intel.com, or from the OEM or retailer.

Copies of documents which have an order number and are referenced in this
document may be obtained by calling 1-800-548-4725 or by visiting
www.intel.com/design/literature.htm.

Intel, the Intel logo, Xeon, and Xeon Phi are trademarks of Intel Corporation
in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others

© 2020 Intel Corporation.

Optimization Notice
-------------------

Intel's compilers may or may not optimize to the same degree for non-Intel
microprocessors for optimizations that are not unique to Intel
microprocessors. These optimizations include SSE2, SSE3, and SSSE3
instruction sets and other optimizations. Intel does not guarantee the
availability, functionality, or effectiveness of any optimization on
microprocessors not manufactured by Intel. Microprocessor-dependent
optimizations in this product are intended for use with Intel
microprocessors. Certain optimizations not specific to Intel
microarchitecture are reserved for Intel microprocessors. Please refer to the
applicable product User and Reference Guides for more information regarding
the specific instruction sets covered by this notice.

Notice revision #20110804