Determine the Correct Interconnect Technology for an HPC Cluster

Last Modified On: July 12, 2007 9:47 AM PDT

Challenge

Determine the type of interconnect technology required for a particular High-Performance Computing (HPC) cluster. The typical choices for cluster networking today are:

  • 100BASE Fast Ethernet over copper or fiber
  • 1000BASE Gigabit Ethernet over copper or fiber
  • Ethernet with a TCP Offload Engine (TOE) built into the network adapter
  • Scalable Coherent Interface (SCI), such as from Scali
  • Myricom Myrinet
  • Quadrics
  • InfiniBand Architecture

 


Solution

Base the decision on whether your applications are fine-grained or coarse-grained. Answering this question helps you decide on the networking technology the cluster requires. Granularity refers to the ratio of compute time to communication time. In parallel computing, independent processes on individual nodes perform their respective computations and then coordinate and synchronize their results by communicating over the network.

Coarse-grained applications spend more time computing than communicating; fine-grained applications must communicate often. Coarse-grained algorithms, such as those found in single-channel seismic codes and rendering programs, require less communication than fine-grained algorithms, such as those found in unstructured mesh codes. Very coarse-grained applications that compute most of the time and communicate very little are referred to as embarrassingly parallel (or naturally parallel) applications.
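As a rough illustration of the definition above, granularity can be estimated from measured compute and communication times. The sketch below is only an example: the function names and the threshold of 10 are assumptions chosen for illustration, not values from the source article.

```python
def granularity(compute_seconds: float, comm_seconds: float) -> float:
    """Return the ratio of compute time to communication time.

    A larger ratio means a coarser-grained application.
    """
    if comm_seconds <= 0:
        # No measurable communication: treat as infinitely coarse-grained.
        return float("inf")
    return compute_seconds / comm_seconds


def classify(ratio: float, coarse_threshold: float = 10.0) -> str:
    """Label an application by its ratio.

    The threshold is a hypothetical cut-off; the real boundary depends
    on the application and on the cluster.
    """
    return "coarse-grained" if ratio >= coarse_threshold else "fine-grained"


# Example: 95 s of computing and 5 s of communicating per 100 s of run time.
ratio = granularity(95.0, 5.0)
print(ratio, classify(ratio))
```

In practice, the two times would come from profiling the application on a representative workload.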

Consider a rendering application running on a cluster, such as digital imaging software for computer-generated imagery in the motion-picture industry. Here, each node is given a frame of the scene to generate and spends most of its time rendering that frame. When the render is complete, the node passes its result to a node that combines all the frames into the final scene. Because the application spends most of its time computing and little time communicating, it is very coarse-grained.

In a cluster, the hardware, including the networking technology, must meet or beat the requirements of the software. Because a coarser-grained application depends less on communication, it does not need a high-performance network. 100BASE-TX Ethernet with a high-latency networking protocol, such as TCP/IP, might provide enough network speed to keep the application running at peak performance. For an embarrassingly parallel application such as the rendering example above, even 10BASE-T Ethernet might suffice. For fine-grained applications, however, Ethernet might become the bottleneck.
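A back-of-the-envelope estimate shows why a slow network can be adequate for the rendering example. The sketch below assumes a hypothetical 12 MB frame that takes 600 seconds to render, and it ignores protocol overhead and contention; none of these figures come from the source article.

```python
def comm_fraction(render_seconds: float, frame_bytes: int,
                  link_bits_per_second: float) -> float:
    """Fraction of a node's time spent sending its finished frame.

    Assumes the link runs at its nominal rate with no protocol overhead.
    """
    transfer_seconds = frame_bytes * 8 / link_bits_per_second
    return transfer_seconds / (render_seconds + transfer_seconds)


# Hypothetical workload: 12 MB frame, 10 minutes of rendering per frame.
RENDER_SECONDS = 600.0
FRAME_BYTES = 12_000_000

for name, rate in (("10BASE-T", 10e6), ("100BASE-TX", 100e6)):
    share = comm_fraction(RENDER_SECONDS, FRAME_BYTES, rate)
    print(f"{name}: {share:.2%} of node time spent communicating")
```

Under these assumptions, the node spends under 2 percent of its time communicating even on 10BASE-T, so a faster interconnect would change the total run time very little.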

The performance of a fine-grained application that communicates often will suffer if it must wait on the network. When the network fails to deliver the performance that the application needs, the application is said to be communications-bound or communications-limited. Fine-grained applications require fast interconnects, such as InfiniBand Architecture or Myrinet with its GM message-passing protocol.

Interconnect performance increasingly affects the performance of the cluster as applications become finer grained. Two characteristics are important to consider: the latency of delivering a message and the maximum bandwidth achievable on the network. High-latency protocols, such as TCP/IP, can degrade cluster performance as much as a network with limited bandwidth can. For fine-grained applications, the network needs to have low latency and high bandwidth. The table below compares the key performance characteristics of different network technologies:
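To see why latency can matter as much as bandwidth, a simple and commonly used approximation models the time to deliver one message as a fixed latency plus the message size divided by the bandwidth. The sketch below applies that model with made-up latency figures on a 1 Gbit/s link; the numbers are illustrative assumptions, not measurements of any technology listed in this article.

```python
def message_time(size_bytes: int, latency_seconds: float,
                 bandwidth_bits_per_second: float) -> float:
    """Estimated delivery time: fixed latency plus time on the wire."""
    return latency_seconds + size_bytes * 8 / bandwidth_bits_per_second


def effective_bandwidth(size_bytes: int, latency_seconds: float,
                        bandwidth_bits_per_second: float) -> float:
    """Bits per second actually achieved for messages of this size."""
    elapsed = message_time(size_bytes, latency_seconds,
                           bandwidth_bits_per_second)
    return size_bytes * 8 / elapsed


LINK_RATE = 1e9  # hypothetical 1 Gbit/s link

for latency in (50e-6, 5e-6):            # hypothetical 50 us and 5 us latencies
    for size in (1024, 1024 * 1024):     # 1 KB and 1 MB messages
        achieved = effective_bandwidth(size, latency, LINK_RATE)
        print(f"latency={latency * 1e6:.0f} us, size={size} B: "
              f"{achieved / LINK_RATE:.1%} of link rate")
```

With these figures, a 1 KB message over the 50-microsecond path achieves only about 14 percent of the link rate, while a 1 MB message achieves more than 99 percent. Fine-grained applications tend to exchange many small messages, which is why they are sensitive to latency even when bandwidth is plentiful.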


Source

Building High-Performance Computing Clusters with Intel® Architecture, Part 1