How to achieve an Intel® Cluster Ready Certification with Intel® Cluster Checker v3.0 on heterogeneous clusters

This article applies only to Intel® Cluster Checker 3.0. For version 3.0.1, the procedure has changed. 

This article describes the steps that are necessary to achieve an Intel® Cluster Ready Certification using Intel® Cluster Checker v3.0 on a heterogeneous cluster. In a heterogeneous cluster, the compute nodes do not all share the same hardware or software configuration. Compute nodes with an identical configuration can be grouped into subclusters.

For general information about the Intel Cluster Ready certification instructions, please refer to the Intel® Cluster Ready Certification Instructions with Intel® Cluster Checker v3.0 document.

On a heterogeneous cluster, the following steps are necessary to achieve an Intel Cluster Ready certification:

  1. Create a node list which contains all cluster nodes (as per Intel Cluster Ready Certification Instructions with Intel Cluster Checker v3.0). It might be necessary to create a separate data provider configuration file for the head node.

  2. Edit the data provider configuration file clckd.xml, so it complies with the Intel® Cluster Ready architecture specification requirements (as per Intel® Cluster Ready Certification Instructions with Intel® Cluster Checker v3.0). It might be necessary to create a separate data provider configuration file for each subcluster.

  3. Create a node list which contains just the head node.

  4. Create node lists for each subcluster.

  5. Run the data collection using the node list that contains only the head node:
    clck-collect -a -f <node list with head node only> [-c <data provider configuration file for the head node>]
    
  6. Run the data collection on each subcluster separately:
    clck-collect -a -f <node list for a subcluster> [-c <data provider configuration file for a subcluster>]
    
  7. Run the data collection for the "all_to_all" data provider using the node list with all nodes:
    clck-collect -m all_to_all -f <node list with all nodes> [-c <data provider configuration file for the head node>]
    
  8. Add the following suppressions to the analyzer configuration file clck.xml, because the hpl, mpi_internode and imb_pingpong data providers cannot produce results for a single head node. Replace "HEADNODE" with the name of your head node.
      <suppressions>
        <suppress>
          <id>hpl-data-missing</id>
          <node_id>HEADNODE</node_id>
        </suppress>
        <suppress>
          <id>mpi_internode-data-missing</id>
          <node_id>HEADNODE</node_id>
        </suppress>
        <suppress>
          <id>imb_pingpong-data-missing</id>
          <node_id>HEADNODE</node_id>
        </suppress>
      </suppressions>
    
  9. If some of your subclusters are equipped with Intel® Xeon Phi™ coprocessors, add the following suppressions to suppress Intel® Xeon Phi™ coprocessor related messages for the nodes that are not equipped with Intel® Xeon Phi™ coprocessors. Replace "NODE" with the name of the node and repeat the entries for each such node. (Also remember to uncomment the miccheck, micinfo and offload_phi checks.)
        <suppress>
          <id>miccheck-data-missing</id>
          <node_id>NODE</node_id>
        </suppress>
        <suppress>
          <id>micinfo-data-missing</id>
          <node_id>NODE</node_id>
        </suppress>
    
  10. Run the analysis.
    clck-analyze -c <your analyzer configuration file> -f <node list with all nodes>
    
  11. Follow the instructions for submission in the Intel® Cluster Ready Certification Instructions with Intel® Cluster Checker v3.0 document.
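
The collection and analysis commands from steps 5, 6, 7 and 10 can be chained in a small shell wrapper. The following is a minimal sketch, not part of the official instructions. The function names `run`, `collect_all` and `analyze_all`, the `DRY_RUN` variable and all file names are hypothetical. Only the `clck-collect` and `clck-analyze` options shown in the steps above are used, and the optional `-c` argument of `clck-collect` is left out for brevity.

```shell
#!/bin/sh
# Sketch of steps 5-7 and 10. All names here are illustrative assumptions.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# collect_all <head node list> <all nodes list> <subcluster list>...
collect_all() {
    head_list=$1
    all_list=$2
    shift 2
    # Step 5: head node only.
    run clck-collect -a -f "$head_list"
    # Step 6: each subcluster separately.
    for sub_list in "$@"; do
        run clck-collect -a -f "$sub_list"
    done
    # Step 7: all_to_all data provider across all nodes.
    run clck-collect -m all_to_all -f "$all_list"
}

# analyze_all <analyzer configuration file> <all nodes list>
analyze_all() {
    # Step 10: analysis over all nodes.
    run clck-analyze -c "$1" -f "$2"
}
```

With `DRY_RUN=1`, a call such as `collect_all head.list all.list sub1.list sub2.list` only prints the commands, which makes it possible to review the sequence before running it on the cluster.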

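Step 9 asks for the suppression entries to be repeated for every node without a coprocessor, which becomes tedious for larger subclusters. As a rough sketch, the entries could be generated with a shell function like the one below. The function name `emit_phi_suppressions` and the input file are hypothetical, and the input is assumed to be a plain text file with one host name per line (not necessarily in the Intel Cluster Checker node list format). The suppression IDs are the two given in step 9.

```shell
#!/bin/sh
# Read node names from stdin (one per line, an assumption of this sketch)
# and print one <suppress> entry per check ID for each node.
emit_phi_suppressions() {
    while IFS= read -r node; do
        # Skip empty lines.
        [ -n "$node" ] || continue
        for id in miccheck-data-missing micinfo-data-missing; do
            printf '        <suppress>\n'
            printf '          <id>%s</id>\n' "$id"
            printf '          <node_id>%s</node_id>\n' "$node"
            printf '        </suppress>\n'
        done
    done
}
```

For example, `emit_phi_suppressions < nodes_without_phi.txt` prints the entries for all listed nodes. The output still has to be pasted inside the `<suppressions>` element of clck.xml by hand.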