Intel's Itanium® Processor Family: The Roadmap

Last Modified On :   May 7, 2008 12:01 AM PDT

by Andrew Binstock, Pacific Data Works LLC


Introduction

According to tests by the vendor-neutral consortium SPEC (the Standard Performance Evaluation Corporation), the Intel Itanium® 2 processor shattered nearly every record for processor performance, regardless of chip architecture and design. As this second generation of the Itanium® processor family enjoys increasing performance-driven demand from IT, Intel has begun extending its long-term vision for the 64-bit architecture, which will continue with the next generation, code-named Madison. This article discusses this vision, with perspectives from industry analysts.

Why 64 Bits at All?

Most desktop PCs today run plenty fast. Hitch nearly any application to an inexpensive machine with a Pentium® 4 processor and you will see it execute briskly and without delay, a result attributable to processor performance. If you are willing to spend more, you can get Intel® Xeon® processor-based systems that will do heavy number crunching and fly through the most demanding desktop applications or, for some workloads, serve as the servers in your datacenter. Given the power of these 32-bit systems and the remarkably low prices at which they're available, why is Intel placing such strategic importance on the Itanium processor family?

One simple and compelling reason is Windows-based multiprocessing servers. 32-bit systems today are physically constrained to access a maximum of 4GB of RAM. For desktop systems or even dual-processing workstations, this ceiling is not much of a limitation. However, on multiprocessing servers running large enterprise level applications, especially those running Windows 2000 Advanced Server* and Datacenter* editions, the 4GB limit is a serious constraint. Windows 2000 Advanced Server (and some versions of Windows XP*) is required for systems that run more than four processors. The difficulty with these advanced versions of Windows* is that they consume approximately half the available RAM for their own internal purposes. So, on a typical multiprocessing server with the full complement of 4GB of RAM, only about 2GB is available to the application.

If the server sports eight processors, this limitation leaves each processor with only 256MB of RAM to work with (2GB divided eight ways). And in the likely case that the processor supports two execution threads via Hyper-Threading technology, this RAM is further divided. These constraints become significant on servers that are doing intensive transactional processing. This particular aspect has tended to discourage the construction of 32-bit servers with more than 8 processors per server, even though some systems, such as those recently announced by IBM*, actually go as high as 16 processors.
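The arithmetic behind these figures can be sketched in a few lines of Python. The numbers come from the paragraphs above. The function name and the assumption that RAM is split evenly across processors and threads are illustrative only, since a real server shares memory dynamically.

```python
# Figures from the text; an even split of RAM is a simplifying assumption.
TOTAL_RAM_MB = 4 * 1024        # 4GB ceiling of a 32-bit system
OS_SHARE = 0.5                 # "approximately half" consumed by Windows
PROCESSORS = 8
THREADS_PER_PROCESSOR = 2      # Hyper-Threading: two execution threads


def ram_per_unit_mb(total_mb, os_share, units):
    """RAM left for the application, divided evenly across `units`."""
    return total_mb * (1 - os_share) / units


per_processor = ram_per_unit_mb(TOTAL_RAM_MB, OS_SHARE, PROCESSORS)
per_thread = ram_per_unit_mb(
    TOTAL_RAM_MB, OS_SHARE, PROCESSORS * THREADS_PER_PROCESSOR
)

print(per_processor)  # 256.0 (MB per processor)
print(per_thread)     # 128.0 (MB per hardware thread)
```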

The solution to the memory limitation is migration of the application to an Itanium® 2-based system, a 64-bit architecture. With a theoretical maximum of roughly 10^19 bytes (2^64) of RAM addressable by the processor, the system imposes little limitation on the amount of data that can reside in memory, as this figure is on the order of 500,000 times the size of the largest commercial disk-based database currently available. Because of the vast amount of RAM they support, 64-bit architectures have emerged as the preferred design for large servers. The rest of this article describes Intel's roadmap of processors for this architecture.
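Both the 4GB ceiling and the 64-bit theoretical maximum follow directly from the width of a memory address. The short Python sketch below shows the calculation. The constant and function names are illustrative, and the result is a theoretical limit, not what any shipping system actually supports.

```python
# Bytes addressable with addresses of a given width (one byte per address).
ADDRESS_SPACE_32 = 2 ** 32
ADDRESS_SPACE_64 = 2 ** 64


def to_gib(n_bytes):
    """Convert a byte count to binary gigabytes (2**30 bytes)."""
    return n_bytes / 2 ** 30


print(to_gib(ADDRESS_SPACE_32))              # 4.0 (the 32-bit ceiling, in GB)
print(f"{ADDRESS_SPACE_64:.2e}")             # 1.84e+19 (bytes)
print(ADDRESS_SPACE_64 // ADDRESS_SPACE_32)  # 4294967296 times larger
```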


The EPIC Release of the Itanium® Processor Family

Intel and Hewlett-Packard teamed up in the late 1990s to develop a new 64-bit architecture design for processors. The two companies wanted an architecture that would avoid the problems that limited performance of 64-bit RISC chips, such as those found in traditional UNIX* servers from SUN, Silicon Graphics, and even Hewlett-Packard itself. In particular, they wanted an architecture that was highly optimized for parallel execution of instructions. The idea was to design the chip and its instruction sets so as to execute as many instructions as possible per clock cycle. In addition, the designers wanted to handle jumps (often called branches) efficiently. Their hope was that a jump would not halt the flow of executing instructions as it does on every other processor architecture. These goals eventually led them to revise some early microprocessor design ideas and cast them into the explicitly parallel instruction computing (EPIC) architecture.

EPIC, as implemented in the Itanium processor family, executes instructions in parallel in what are known as instruction bundles, composed of three instructions each. The processor attempts to schedule two instruction bundles in parallel per clock, which allows it to schedule and execute a maximum of six instructions in a single clock cycle. Making the Itanium architecture-based system far "wider" than its 32-bit alternatives yields substantial performance gains through increased execution throughput. The EPIC architecture, however, is a server-oriented design and so, in addition to instruction execution, it pays careful attention to system throughput. In the Itanium 2 processor, there are three levels of on-processor cache, with the largest measuring a full 3MB. The system is also equipped with a high-bandwidth front-side bus (FSB) that allows memory accesses at 6.4GB/sec, largely eliminating bus-bandwidth bottlenecks for instructions and data. This bandwidth is so great that it is currently exceedingly difficult to devise a scenario in which it becomes a bottleneck. Such high throughput means that on Itanium-based systems, management of large datasets or very large databases is not subject to system bus-bandwidth restrictions.
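As a rough illustration, the peak figures above reduce to simple multiplication. In the Python sketch below, the bundle counts and the 6.4GB/sec result come from the text. The 1 GHz clock and the 128-bit, 400-million-transfers-per-second bus parameters are assumptions added for the example and are not stated in this article. Peak numbers are upper bounds, and sustained throughput depends on how well the compiler fills each bundle.

```python
INSTRUCTIONS_PER_BUNDLE = 3   # from the text
BUNDLES_PER_CLOCK = 2         # from the text


def peak_instructions_per_clock(bundles=BUNDLES_PER_CLOCK,
                                per_bundle=INSTRUCTIONS_PER_BUNDLE):
    """Upper bound on instructions issued in one clock cycle."""
    return bundles * per_bundle


def peak_instructions_per_second(clock_hz):
    """Upper bound on instructions issued per second at a given clock."""
    return peak_instructions_per_clock() * clock_hz


def bus_bandwidth_bytes_per_sec(width_bits, transfers_per_sec):
    """Peak bus bandwidth: bytes moved per transfer times transfer rate."""
    return (width_bits // 8) * transfers_per_sec


ASSUMED_CLOCK_HZ = 1_000_000_000    # assumption: 1 GHz, not from the article
ASSUMED_BUS_WIDTH_BITS = 128        # assumption, not from the article
ASSUMED_TRANSFERS = 400_000_000     # assumption, not from the article

print(peak_instructions_per_clock())                   # 6
print(peak_instructions_per_second(ASSUMED_CLOCK_HZ))  # 6000000000
print(bus_bandwidth_bytes_per_sec(ASSUMED_BUS_WIDTH_BITS,
                                  ASSUMED_TRANSFERS) / 1e9)  # 6.4 (GB/sec)
```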

The EPIC architecture also includes backward compatibility with traditional Intel x86 instructions. This feature enables sites to run legacy 32-bit code that has not been recompiled for EPIC, while migrating their applications to the 64-bit world.

By providing a high degree of instruction parallelism and very substantial throughput, the designers of EPIC are currently delivering one of the fastest 64-bit processors on the market.


Itanium® Processors Roll Out

The first chips in the Itanium processor family rolled out in May 2001; they were called simply the Itanium processor. At the time of their arrival, they established a new performance record for integer computation and were competitive with the fastest RISC chips at the time. This first processor served mostly as an early proof of concept. It enabled hardware vendors to design new 64-bit systems and permitted software vendors to have a platform that let them begin porting and testing their applications. Even as a herald of coming technology, however, the Itanium processor enjoyed some remarkable successes. Most notable was the selection by the National Center for Supercomputing Applications (NCSA) of 3,300 Itanium processors in its TeraGrid venture, a joint project by four universities, which in 2003 will deliver the most powerful supercomputer in North America.

Having primed its 64-bit initiative with the original Itanium chip, Intel rolled out the Itanium 2 processor in July 2002. This processor arrived with a bang. Most analysts expected an incremental upgrade from the original instance of the chip, but instead they discovered that the new Itanium 2 architecture ran at a blistering pace that set numerous world records in performance. These records, listed in Table 1, underscore how the performance profile squarely targets enterprise workloads.

Benchmark Test | Itanium® 2 Rating | Improvement over Best RISC Processor
Floating-point computation (SPECfp_base2000) | 1356 | +13%
Web serving (SPECweb99_SSL) | 1520 | +45%
SAP 2-tier benchmark (run by SAP in its own laboratory) | 470 SD users | +12%
TPC-C database benchmark (from the Transaction Processing Performance Council) | 78.4K tpmC | +56%

Table 1. How the Itanium 2 stacks up vs. RISC chips in standard benchmarks.
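The improvement column can be read back into an implied score for the best RISC result on each benchmark. The Python sketch below does this under the assumption that each percentage means (Itanium 2 score - RISC score) / RISC score. The article does not state how the percentages were computed, so the derived baselines are illustrative estimates, not reported results.

```python
# (Itanium 2 score, improvement in percent) as listed in Table 1.
TABLE_1 = {
    "SPECfp_base2000": (1356, 13),
    "SPECweb99_SSL": (1520, 45),
    "SAP 2-tier (SD users)": (470, 12),
    "TPC-C (thousands of tpmC)": (78.4, 56),
}


def implied_baseline(score, improvement_pct):
    """Best-RISC score implied by a score and its percentage improvement.

    Assumes improvement = (score - baseline) / baseline.
    """
    return score / (1 + improvement_pct / 100)


for name, (score, pct) in TABLE_1.items():
    print(f"{name}: implied best RISC result is about "
          f"{implied_baseline(score, pct):.1f}")
```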

Performance marks like these turned heads right away. The market analyst group Illuminata* commented, "Itanium 2 (processor) cracks skepticism with stunningly good performance. Benchmarks show it runs faster than any other chip on the market for some things, and a close second for others. And this from a chip that's barely arrived and can only improve in the future." Vendors also took note. For example, Dell*, which had originally decided to introduce Itanium 2 servers on a slower track while the processor gained further traction, suddenly reversed course. Within 90 days of shipment, the Itanium 2 processor was the announced highlight of servers from Hewlett-Packard*, Dell*, IBM*, and Unisys*. Smaller vendors also announced support. The servers garnered considerable participation from vendors of operating systems: Microsoft* currently ships a special 64-bit edition of Windows 2000 Advanced Server and Windows Server 2003* for the Itanium platform, Red Hat* ships an Itanium 2 version of Linux, and Hewlett-Packard* is actively porting HP-UX* 11 to the chip. The Itanium 2 processor is the only 64-bit processor in history to run four separate commercial operating systems. Between the performance, the wide hardware support, the operating system support, and the large assembly of ported enterprise applications (Oracle* and SAP* among others), it is likely the EPIC architecture will soon be climbing to the top spot in the high performance enterprise market.


Next Generations

The next member of the Itanium processor family, the Itanium 2 processor formerly code-named Madison (released June 30, 2003), and its successor, code-named Montecito, extend the roadmap for 64-bit computing. Both processors will focus on maintaining Intel's leadership position in performance, evolving with minor to more significant additions of new features.

The current Itanium 2 processors are fabricated using Intel's 0.18 micron technology. Madison uses the 0.13 micron fabrication process, which leads to inherently faster, smaller, cooler chips. These sub-micron measurements refer to the width of the pathways through which signals must flow when traveling between individual transistors. When these paths shrink, the electrons do not travel as far because the entire grid takes up less room. As a result, signals get to their destinations faster and the whole chip can be run at a higher speed. In addition to higher speed, smaller pathways enable more transistors to be placed on the processor die. This extra room will be needed: in an interview with eWeek magazine, Intel CEO Craig Barrett recently stated that Madison will contain 500 million transistors, a record for Intel processors. Most analysts expect a significant portion of these transistors will be dedicated to expanding the on-chip L3 cache from 3MB to 6MB, thereby accelerating performance and throughput. The clock on the Madison processor is projected to be revved to 1.5 GHz as well. Madison will be followed in mid-2004 by the Madison 9M processor, with an on-chip L3 cache of 9MB and a further clock increase.
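To get a feel for what the move from 0.18 to 0.13 micron buys, the Python sketch below applies idealized geometric scaling: if every feature shrinks by the same linear factor, area shrinks by the square of that factor. This is a simplifying assumption for illustration, since real designs do not scale perfectly, and the function names are hypothetical.

```python
OLD_PROCESS_MICRON = 0.18   # current Itanium 2, from the text
NEW_PROCESS_MICRON = 0.13   # Madison, from the text


def linear_scale(old, new):
    """Ratio of new feature width to old feature width."""
    return new / old


def area_scale(old, new):
    """Idealized ratio of new circuit area to old circuit area."""
    return linear_scale(old, new) ** 2


shrink = area_scale(OLD_PROCESS_MICRON, NEW_PROCESS_MICRON)
density_gain = 1 / shrink

print(f"{shrink:.2f}")        # 0.52: same circuit in about half the area
print(f"{density_gain:.2f}")  # 1.92: nearly twice the transistors per area
```

Under this idealized model, the shrink alone frees roughly half the die area, which is consistent with the expectation above that the extra transistors go largely to a bigger L3 cache.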

The subsequent processor, code-named Montecito, is expected not only to increase processor frequency and add more cache but also to add a new feature called dual-core, which means two separate processor cores on one physical chip. This will result in significant performance gains. Montecito will also move to a new manufacturing technology. When Montecito ships in 2005, Intel will have established a new pace for releases of 64-bit processors. Prior to the arrival of the Itanium processor, the world of 64-bit chips was competitive, but the competition lacked intensity. New releases were brought out without great haste and most chips performed satisfactorily. The 64-bit servers tended to be sold on the basis of the company behind them rather than the processors within them. Intel, by virtue of being a parts supplier to server vendors, does not have the luxury of this position. As a result, its Itanium architecture is exerting performance pressure on its RISC counterparts. And with a clear commitment to two more generations of Itanium processors in the next two years, Intel is greatly accelerating the pace at which these chips must rev just to stay competitive.

Like other analysts, I believe this competition cannot long endure. The cost of designing competitive chips is extremely high, as is the expense of fabrication. Only vendors whose core business is processors are likely to be capable of keeping pace with Intel. Of Intel's RISC competitors, only IBM* fits this profile. The fact that the three leading vendors of RISC servers (IBM*, SUN*, and Hewlett-Packard*) all now sell 64-bit Itanium 2 processors from Intel suggests that this view is not wholly the province of market analysts. In view of this factor, when the Itanium 2 processor assumes market-share leadership, software vendors who have already migrated their applications to the platform will enjoy an increased advantage over their competitors.

Resources and References

SPEC: all benchmark results are browsable at www.spec.org*

TPC: all benchmarks by the Transaction Processing Performance Council are browsable at www.tpc.org*

Grid Supercomputer Demonstrates Intel® Itanium® 2 Processor Prowess

About the Author

Andrew Binstock is the principal analyst at Pacific Data Works LLC and writes frequently on processor technology. Previously, he was a senior manager at PricewaterhouseCoopers, where he was in charge of the firm's global technology forecasts. He can be reached at abinstock@pacificdataworks.com.