Troubleshooting the packages check

When you run the Intel® Cluster Checker packages check, you may see a diagnostic message similar to the following:

Installed Packages, (packages).........................................FAILED
subtest 'dapl-2.0.15-4.el5' failed
- failing host mycluster returned: 'unexpected'
subtest 'dapl-2.0.13-4.el5' failed
- failing host mycluster returned: 'missing'


The packages check reports discrepancies between the set of RPM packages installed on the system and the set of RPMs that are expected to be installed. The check uses two files that list the RPMs that should be installed on the head node and on the compute nodes, respectively. These files are specified in the packages configuration block of the Intel® Cluster Checker configuration input file:

<packages>
<head>/etc/intel/clck/head_node_packages.list</head>
<node>/etc/intel/clck/compute_node_packages.list</node>
</packages>



If an RPM package is installed on a node but is not listed in the corresponding list file, the check reports an ‘unexpected’ diagnostic message. If an RPM package is listed in the file but is not installed on the node, the check reports a ‘missing’ diagnostic message. In the example above, mycluster is the cluster head node. The output from Intel® Cluster Checker indicates that the dapl-2.0.15-4.el5 RPM is installed on the node, but that RPM is not included in the file head_node_packages.list. The output also shows that the dapl-2.0.13-4.el5 RPM, which is listed in that file, is not installed.
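The classification described above amounts to two set differences. The following Python sketch illustrates that logic only; it is not how Intel® Cluster Checker is implemented. The function name `compare_packages` and the `bash-3.2-24.el5` entry are hypothetical and included purely for illustration.

```python
def compare_packages(installed, expected):
    """Classify differences between installed and expected RPM sets.

    'unexpected': installed on the node but absent from the list file.
    'missing':    present in the list file but not installed on the node.
    """
    installed, expected = set(installed), set(expected)
    return {
        "unexpected": sorted(installed - expected),
        "missing": sorted(expected - installed),
    }


# The dapl entries come from the example output; the bash entry is a
# hypothetical package that is both listed and installed.
installed = ["dapl-2.0.15-4.el5", "bash-3.2-24.el5"]
expected = ["dapl-2.0.13-4.el5", "bash-3.2-24.el5"]

result = compare_packages(installed, expected)
print(result)
```

A package that appears in both sets produces no diagnostic, which is why only the two dapl entries show up in the result.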

The list files are a record of the software state of a certified, compliant Intel® Cluster Ready solution. A cluster may function correctly, or appear to, but a failing packages check indicates that the cluster is not an exact copy of this certified Intel® Cluster Ready solution and is potentially not compliant with the Intel® Cluster Ready specification.

If the error appears during verification of a cluster installation, check that the installed software matches the software listed in the software bill of materials, including the version. If the only difference is the version of the installed software and that difference is intentional, verify that the update complies with the Intel® Cluster Ready incremental software version allowances. Briefly, existing software packages can be updated without re-certification if the updates are only minor version increments that provide fixes or enhancements to the existing software and do not add or modify capabilities and features. In this case, refer to the Intel® Cluster Checker User’s Guide for information on how to update the packages list files. Then re-run Intel® Cluster Checker with the new files and ensure that the packages check passes with the updated RPM lists.
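To see whether a failure is only a version difference, it can help to pair each ‘missing’ entry with an ‘unexpected’ entry of the same package name. The sketch below does this under the assumption that entries follow the usual RPM `name-version-release` form, as the dapl entries in the example do. The helper names `split_nvr` and `version_mismatches` are hypothetical. The sketch does not decide whether a change qualifies as an allowed minor increment; that judgment comes from the Intel® Cluster Ready specification.

```python
def split_nvr(package):
    """Split 'name-version-release' into (name, version, release).

    The name may itself contain hyphens, so split from the right.
    """
    name, version, release = package.rsplit("-", 2)
    return name, version, release


def version_mismatches(unexpected, missing):
    """Pair missing and unexpected entries that share a package name.

    Returns (name, expected_version, installed_version) tuples.
    Simplification: if several versions of one package are installed,
    only the last one seen is kept.
    """
    found = {}
    for pkg in unexpected:
        name, version, release = split_nvr(pkg)
        found[name] = f"{version}-{release}"

    mismatches = []
    for pkg in missing:
        name, version, release = split_nvr(pkg)
        if name in found:
            mismatches.append((name, f"{version}-{release}", found[name]))
    return mismatches


pairs = version_mismatches(["dapl-2.0.15-4.el5"], ["dapl-2.0.13-4.el5"])
for name, want, have in pairs:
    print(f"{name}: expected {want}, installed {have}")
# prints "dapl: expected 2.0.13-4.el5, installed 2.0.15-4.el5"
```

Entries that remain unpaired are packages that are entirely absent or entirely unlisted, rather than version differences.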

If the software version differences were not intentional or do not comply with the incremental software version allowances, the resolution may be to re-install the cluster using the exact versions specified, or to replace the identified RPM packages with the proper versions using the mechanisms of the cluster provisioning system. Always update software on a cluster through the provisioning middleware to ensure proper installation. Manually installing RPMs on nodes may work initially but can lead to issues later, when nodes are replaced or re-provisioned.

The packages check may also report failures on a cluster some time after the initial deployment of the system. This can result from an update to the software initially installed on the cluster. As with the initial installation, this may reflect a desired change to the cluster software stack, but it also indicates that the system is potentially no longer Intel® Cluster Ready compliant. If the update is desired, first make back-up copies of the existing RPM list files, then update them to reflect the new state of the software installation. It is also strongly recommended that you verify that the system is still Intel® Cluster Ready compliant. Refer to the Intel® Cluster Checker User’s Guide for more information on how to check whether a cluster is compliant.
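The back-up step can be scripted so that each copy is timestamped and earlier back-ups are not overwritten. The following is a minimal sketch, assuming the list files are at the paths shown in the configuration block above. The function name `backup_list_file` and the `.bak` naming scheme are hypothetical choices, not part of Intel® Cluster Checker.

```python
import shutil
import time
from pathlib import Path


def backup_list_file(path):
    """Copy a packages list file to '<name>.<timestamp>.bak' beside it."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target


# Paths taken from the <packages> configuration block; adjust them if
# your configuration file points elsewhere.
for list_file in ("/etc/intel/clck/head_node_packages.list",
                  "/etc/intel/clck/compute_node_packages.list"):
    if Path(list_file).exists():
        print("backed up to", backup_list_file(list_file))
```

Run the script with permission to write to the directory that holds the list files. Updating the list files themselves should follow the procedure in the Intel® Cluster Checker User’s Guide.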
