Previewing Intel® Cluster Ready Specification Version 1.2
By Brock A. Taylor
Update Will Address Backward Compatibility, Allow Two-Node Systems
How do you maintain backward compatibility while pushing forward with rapid adoption of the newest technologies? Given the dynamic, ever-changing ecosystem of components that make up cluster solutions, that’s an important question.
An upcoming update to the Intel® Cluster Ready specification aims to give solution providers more control over which earlier capabilities they support, and to give application vendors a way to determine which software interfaces a certified solution provides.
Version 1.2 of the Intel Cluster Ready specification will be available in conjunction with the Supercomputing 2010 conference in November. Most requirements will remain unchanged from the current version, but several modifications have been made to the existing requirements in response to community feedback, and we’ve made other changes to improve clarity and organization.
New Flexibility for Backward Compatibility
Intel Cluster Ready defines a cluster applications platform, including hardware and software requirements. However, as newer hardware components get integrated into solutions, many of the software requirements need to keep pace. This means that the minimum versions of libraries included in the original specification begin trending toward “legacy” support. Providing a known and common interface to applications could become a complex task.
To keep the process simple and understandable, version 1.2 expresses the library requirements as a set of versions needed for compliance with a specific version of the Intel® Cluster Ready architecture. In other words, the library requirements become a snapshot of the libraries commonly available in the Linux distributions.
Using this framework, application developers can target support by architecture version, and solution providers can balance the stability of the solution against the desire to provide the latest “bleeding-edge” software support. Organizing the requirements into sets allows the architecture to progress over time while providing a path for backward compatibility.
Therefore, a solution provider must communicate which version(s) of the Intel Cluster Ready architecture a given implementation supports. Fortunately, the program designers anticipated this communication need, and a mechanism is already in place: solution providers already need to set the CLUSTER_READY_VERSION environment-style variable on all compliant systems. As the value of the variable, solution providers simply list the versions of the Intel Cluster Ready specification to which the certified solution complies.
Applications can read this variable by examining the file /etc/intel/icr, parsing it for the versions supported by the solution, and determining whether the application can execute using one of the sets of libraries that exist on the system. In fact, with this method, applications can select whichever set of libraries best matches the application’s requirements. Similarly, solution providers can choose how many sets to include on compliant systems.
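As a rough illustration, an application’s launcher could parse /etc/intel/icr and intersect the solution’s supported versions with the versions it was built against. This is a minimal sketch, not the program’s prescribed method: the exact file format is an assumption here (a shell-style assignment such as `CLUSTER_READY_VERSION=1.1:1.2`, with versions separated by colons), and the function names are hypothetical.

```python
# Hypothetical sketch of version negotiation against /etc/intel/icr.
# Assumed (not specified in this article): the file contains a line like
#   CLUSTER_READY_VERSION=1.1:1.2
# listing colon-separated Intel Cluster Ready architecture versions.

def parse_icr_versions(text):
    """Return the list of architecture versions declared in the icr file text."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CLUSTER_READY_VERSION="):
            value = line.split("=", 1)[1].strip().strip('"')
            return [v for v in value.split(":") if v]
    return []

def select_version(supported, required):
    """Pick the newest version both the solution and the application support."""
    usable = sorted(
        set(supported) & set(required),
        key=lambda v: tuple(int(p) for p in v.split(".")),
    )
    return usable[-1] if usable else None

if __name__ == "__main__":
    sample = "CLUSTER_READY_VERSION=1.1:1.2\n"
    versions = parse_icr_versions(sample)   # e.g. ['1.1', '1.2']
    choice = select_version(versions, {"1.2"})
    print(versions, choice)
```

An application built only against the version 1.1 libraries would pass `{"1.1"}` as its required set and refuse to launch if `select_version` returns None.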
Backward compatibility is a complex challenge. With this approach, the Intel Cluster Ready specification targets a model that allows an appropriate level of backward compatibility for platform integrators and ISVs, while preserving the flexibility needed so that solutions can evolve over time.
Deploying on Fewer than Four Compute Nodes – But Still Certifying on Four or More Nodes
Another important change clarifies the minimum number of cluster nodes for a compliant end-user solution. With version 1.2, solution providers can offer, sell, and deploy certified solutions with fewer than four compute nodes. However, solution providers will still need to certify the base solution design on systems with four or more compute nodes per sub-cluster.
The rationale for this requirement is that many design issues can be masked by execution on just two nodes. Even four nodes is a small cluster that can hide some component scaling issues, but at four nodes there is a much greater chance of catching otherwise hidden design flaws, and thus a lower risk of shipping an errant cluster design.
The update will also include a number of changes aimed at improving readability and making the specification easier to navigate. We’re merging and consolidating sections to provide a better grouping of similar requirements, so you’ll see fewer major sections. There are also some additions to the architecture overview and expanded definitions to provide better understanding of the requirements.
New Version of Intel® Cluster Checker
In conjunction with the updated specification, Intel will launch a new version of Intel® Cluster Checker that provides appropriate mechanisms for the updated requirements. You’ll want to be sure to use this new version of the tool to produce the next iteration of your compliant solution, application, or component.
Brock Taylor is an Intel engineering manager, cluster solutions architect, and a co-author of the Intel Cluster Ready Specification. Brock joined Intel in 2000 and has been part of the Intel Cluster Ready program since its inception. He has a B.S. in Computer Engineering from Rose-Hulman Institute of Technology and an M.Sc. in High Performance Computing from Trinity College Dublin.
» Want to learn more from Brock? Check out his blog on ClusterConnection.
Simpson Strong-Tie: Speeding Time-to-Market
Strengthening Product Design with HPC
For Simpson Strong-Tie, high-performance computing (HPC) plays a key role in creating the structural products that help people build stronger, safer buildings. As the company's existing HPC system approached the end of its life, the R&D engineering group sought a new processing platform that could handle more detailed design simulation and deliver faster results without adding IT complexity.
Working with systems integrator Silicon Mechanics, the IT group selected an Intel Cluster Ready system with Intel® Xeon® processors. The new cluster provides more than twice the processing cores of the previous system in half the data center footprint. Deployed in just two days, the new system enables product developers to run complex models and get results two to three times faster than before, while reducing the total cost of ownership. The company is producing more products with reduced time to market. Read the full case study.