by Matt Gillespie
The 64-bit Intel® Xeon® processor extends the choices for organizations developing and deploying enterprise applications based on Oracle Database* 10g. This architecture joins the Intel® Itanium® Processor Family as a compelling value proposition for large-scale data-driven applications. Each of these processor families provides outstanding performance, as detailed on Intel's performance Web page. This article provides decision makers and developers working with Oracle solutions with the background to differentiate functionally between the two types of 64-bit Intel® architecture, as well as to take the next steps toward deciding upon a deployment platform.
The scalable 64-bit Intel Xeon processor MP-based server platform with the Intel® E8500 chipset provides faster system responsiveness and performance than the previous generation. In addition to enabling servers to run 64-bit operating systems and applications, breaking through the 4GB memory limitation that constrains 32-bit platforms, systems based on the 64-bit Intel Xeon processor also run 32-bit applications natively, on the same server as 64-bit applications. This flexibility allows a single machine to meet a wide variety of needs, supporting each with high performance and scalability.
Intel Itanium®-based systems continue to be the platform of choice for migration from proprietary RISC platforms, and they are Intel's premier platform for the largest and most demanding enterprise solutions. The Itanium Processor Family's explicitly parallel architecture, massive caches, and other heavy-duty processing capabilities will continue to make it Intel's highest-performing and most reliable server platform for the very largest enterprise workloads for many years to come. Deciding between the 64-bit Intel Xeon and Itanium processor platforms requires an understanding of the architectural advantages of each, as well as the ability to test solutions on each platform. The tools and support provided by Intel help you get the best performance possible out of either platform.
Oracle Database 10g: Built on and for Intel Architecture
Very early in the design cycle for Oracle Database 10g, teams from Intel and Oracle began working together to tune the platform for 64-bit Intel® architecture. These same teams had achieved substantial performance gains during the development of Oracle 9i*, in part through the use of the Intel® Compilers. Building on those performance gains, the teams applied substantially more aggressive optimizations with the compilers, pushing performance higher with ever-growing workloads. That process is ongoing, and it continues to build performance with each successive Oracle release.
One key component of tuning Oracle 10g with the Intel Compiler was the use of profile-guided optimization (PGO), which involves instrumenting the code to allow gathering of information during execution about how often each piece of code is accessed. This process allowed frequently accessed passages of code to be placed in close proximity to each other. That placement substantially reduced code branching and shrank code size, allowing better use of cache resources than would otherwise be possible. As a result, the development teams achieved substantial increases in the efficiency of instruction fetching, contributing to overall performance gains.
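To make the idea of profile data concrete, the following C sketch counts by hand how often each side of a branch runs. The `branch_profile` structure and the `sum_valid` function are hypothetical names invented for this illustration. An instrumented build gathers comparable counts automatically and in its own format, so this is an analogy, not the mechanism the Intel Compilers actually use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters standing in for the data an instrumented build records. */
struct branch_profile {
    unsigned long hot;   /* times the common path ran */
    unsigned long cold;  /* times the rare path ran */
};

/* Sums the non-negative values; negative values take the rare path.
   Each path bumps its own counter so the skew between them is visible. */
long sum_valid(const int *values, size_t n, struct branch_profile *prof)
{
    assert(prof != NULL);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] >= 0) {
            prof->hot++;
            total += values[i];
        } else {
            prof->cold++;
        }
    }
    return total;
}
```

For the input {1, 2, -1, 3}, the function returns 6 and leaves the counters at 3 (hot) and 1 (cold). Counts of this kind are what let an optimizer keep the frequently executed path contiguous and move rarely executed code out of the way.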
The Intel and Oracle teams also achieved performance gains during the development of Oracle 10g by means of interprocedural optimization (IPO) on the Intel Compilers. IPO creates information files about each program module to determine potential benefits that can be achieved by inlining functions. By placing functions inline to their calling location, the teams increased the placement of related functions close together, further reducing code branching, as well as decreasing the need to set up parameters for function calls.
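The effect of inlining can be sketched in a few lines of C. The two functions below compute the same total. The first calls a small helper for every element, and the second shows the helper's body substituted at the call site, roughly as an inliner would do it. All names here are hypothetical, and both versions sit in one file for brevity, whereas IPO applies the same idea across separately compiled modules.

```c
#include <assert.h>
#include <stddef.h>

/* Small helper: price after a percentage discount, in cents. */
static long discounted(long price_cents, long percent)
{
    return price_cents - (price_cents * percent) / 100;
}

/* Version 1: one function call (with parameter setup) per element. */
long total_with_calls(const long *prices, size_t n, long percent)
{
    assert(prices != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += discounted(prices[i], percent);
    return total;
}

/* Version 2: the helper body written out at the call site,
   so the loop contains no call and no parameter passing. */
long total_inlined(const long *prices, size_t n, long percent)
{
    assert(prices != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += prices[i] - (prices[i] * percent) / 100;
    return total;
}
```

Both versions return the same result for any input. The difference is only in how much call overhead and branching the generated code carries.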
Another significant area of optimization and development in the Oracle 10g lifecycle has been Linux*-specific work on such areas as huge-memory page sizes. The ability to take good advantage of huge page sizes in 64-bit systems, which tend to have large amounts of system memory, is a substantial contributor to overall system performance. Intel provided optimization assistance in support of Oracle's continuing pioneering work in this and other areas, including the development of loaders, drivers, and a scheduler that benefit Oracle users as well as the Linux community in general.
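A short calculation suggests why page size matters on large-memory systems. The C sketch below counts how many pages are needed to map a memory region. The 4 KiB and 2 MiB page sizes and the 8 GiB region used in the usage note are assumptions chosen for illustration, not figures from Oracle or Intel, and the function name is hypothetical.

```c
#include <assert.h>

/* Number of pages needed to cover region_bytes, rounding up
   when the region is not an exact multiple of the page size. */
unsigned long long pages_needed(unsigned long long region_bytes,
                                unsigned long long page_bytes)
{
    assert(page_bytes > 0);
    return region_bytes / page_bytes + (region_bytes % page_bytes != 0);
}
```

Under these assumptions, an 8 GiB region needs 2,097,152 pages of 4 KiB but only 4,096 pages of 2 MiB. Fewer pages mean fewer address translations for the system to track, which is one plausible source of the performance benefit described above.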
Because Oracle 10g optimizations targeted Intel® processors at each phase of the development lifecycle, high performance on Intel architecture is literally an integral part of the database design. Compiler optimizations such as those achieved by PGO and IPO have had the further benefit of making those optimizations controllable at compile time. That degree of control enables Oracle to apply separate, processor-specific optimizations to different versions of the database platform. Thus, it is relatively simple for Oracle to provide individually optimized distributions for different Intel processor platforms.
Oracle Workload Advantages on the 64-bit Intel Xeon Processor
The Oracle development teams worked to enable high performance on Intel® Extended Memory 64 Technology (Intel® EM64T), which underlies 64-bit Intel Xeon processors, long before those processors were introduced. The result is that Oracle 10g is tuned specifically for this platform, providing very high levels of performance for Oracle-based applications in the enterprise. A particular focus of this development effort was to enable optimal utilization of Intel EM64T on very large Oracle workloads, including those in the context of grid computing.
The ability of 64-bit Intel Xeon processors to natively access memory in excess of the 4GB limit imposed by 32-bit processors enables this platform to eclipse the performance of previous generations of Intel Xeon processors. The 64-bit Intel Xeon processor MP can access up to one terabyte of memory, allowing it to place massive amounts of data in on-board memory, where the system can access it without recourse to swap or paging files. The fast access to that data afforded by this increase in system memory vastly increases the performance of the system. These processors also have very high clock speeds and fast front-side buses that further enhance the ability of the platform to meet the needs of enterprise database applications.
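The memory limits mentioned above follow directly from the number of address bits. The minimal C sketch below computes the number of bytes reachable with a given address width. The function name is hypothetical, and the 40-bit figure in the usage note is simply the width that corresponds to one terabyte counted in binary units, not a specification quoted from Intel.

```c
#include <assert.h>

/* Bytes addressable with the given number of address bits (2^bits). */
unsigned long long addressable_bytes(unsigned address_bits)
{
    assert(address_bits < 64);   /* keep the shift within 64-bit range */
    return 1ULL << address_bits;
}
```

With 32 address bits the result is 4,294,967,296 bytes (4GB), and with 40 bits it is 1,099,511,627,776 bytes (one terabyte), a 256-fold increase.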
Database applications are highly threaded implementations that benefit greatly from parallel execution resources. Like their predecessors in the Intel Xeon processor family, 64-bit Intel Xeon processors support Hyper-Threading Technology¹ (HT Technology), which enables simultaneous execution of two threads by a single physical processor. By exposing two logical processors to the operating system, each of which works on its own thread, the processor is able to substantially reduce idle resource time, improving performance. This hardware optimization works together with the optimization of the Oracle 10g platform and of individual software applications that execute on top of that platform to create integrated solutions that generate optimal performance.
The emerging generation of multi-core Intel Xeon processors advances the parallelization afforded by HT Technology to the next level. Dual-core processors place two separate execution cores on a single processor die. Thus, while the processor still fits into the same standard motherboard socket as previous generations, it contains two completely separate sets of execution resources, each of which functions independently of the other. Furthermore, each of those cores can support HT Technology, which creates an extraordinary processing engine for the needs of Oracle-based applications. Future generations of 64-bit Intel Xeon processors will incorporate more than two processing cores on a single processor die, which promises increased headroom for the future.
¹ Hyper-Threading Technology requires a computer system with an Intel® Pentium® 4 processor supporting HT Technology and a Hyper-Threading Technology enabled chipset, BIOS, and operating system. Performance will vary depending on the specific hardware and software you use. See http://www.intel.com/products/ht/hyperthreading_more.htm for more information including details on which processors support HT Technology.
Itanium-Based Servers Support the Largest Oracle Implementations
For massive Oracle-based implementations, as well as the migration of solutions from proprietary RISC architectures, the Itanium Processor Family provides mainframe-scale execution resources with extraordinary scalability, reliability, and return on investment. The Oracle 10g development teams carried out many optimizations for the Itanium architecture, particularly with regard to improving the efficiency of memory management and branch prediction. Such advances add to the native robustness of the silicon platform to support the largest workloads with premium levels of performance.
In addition to being able to access massive amounts of memory, the Itanium processor has very large L1, L2, and L3 caches and a high-speed front-side bus, and it is designed from the ground up to operate in systems based on large numbers of processors. The Itanium 2 microarchitecture is based on Explicitly Parallel Instruction Computing (EPIC), which allows the simultaneous execution of up to six instructions, in two bundles of three instructions each. These characteristics give it the ability to execute instructions at a very high rate, which corresponds to high performance, particularly in the highly threaded environments that are common in Oracle-based applications.
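The bundle structure can be sketched in C. In the Itanium instruction format, a bundle is 128 bits wide and holds a 5-bit template followed by three 41-bit instruction slots. The code below packs and unpacks that layout using two 64-bit halves. The `bundle` structure and the function names are hypothetical, and the sketch only models the bit layout. It does not decode or execute real instructions.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_MASK ((1ULL << 41) - 1)   /* 41 bits per instruction slot */

/* One 128-bit bundle, split into bits 0-63 (lo) and bits 64-127 (hi). */
struct bundle {
    uint64_t lo;
    uint64_t hi;
};

/* Layout: template in bits 0-4, slot 0 in bits 5-45,
   slot 1 in bits 46-86, slot 2 in bits 87-127. */
struct bundle pack_bundle(unsigned template_id,
                          uint64_t s0, uint64_t s1, uint64_t s2)
{
    assert(template_id < 32);
    assert(s0 <= SLOT_MASK && s1 <= SLOT_MASK && s2 <= SLOT_MASK);
    struct bundle b;
    /* The low 18 bits of slot 1 fit in lo; its upper 23 bits go to hi. */
    b.lo = (uint64_t)template_id | (s0 << 5) | (s1 << 46);
    b.hi = (s1 >> 18) | (s2 << 23);
    return b;
}

/* Extracts the 5-bit template field. */
unsigned bundle_template(struct bundle b)
{
    return (unsigned)(b.lo & 0x1F);
}

/* Extracts instruction slot 0, 1, or 2 as a 41-bit value. */
uint64_t bundle_slot(struct bundle b, int slot)
{
    assert(slot >= 0 && slot <= 2);
    if (slot == 0)
        return (b.lo >> 5) & SLOT_MASK;
    if (slot == 1)
        return ((b.lo >> 46) | (b.hi << 18)) & SLOT_MASK;
    return b.hi >> 23;
}
```

Two such bundles issued together account for the six instructions mentioned above. The template field is how the compiler tells the hardware which kinds of execution units the three slots need, which is part of what makes the parallelism explicit.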
Other key architectural characteristics that add to the ability of Itanium-based systems to achieve high performance on Oracle workloads include the following:
- Predication. The Itanium processor has the capability to implement conditional execution using a mechanism known as predicated instructions. This capability enables the processor to execute both branches of a piece of code and to discard the result that is not needed. By selectively applying predication, the processor effectively reduces the number of branches in a piece of code, removing the performance deficits associated with having to wait for a new sequence of instructions to pass all the way through the execution pipeline in cases of branch mis-prediction.
Even though predication necessarily requires execution of computations whose results are discarded, the Itanium architecture performs it as a 'zero-cost' operation. Predication is a powerful mechanism for reducing the usual performance deficits associated with very branchy code such as that typically associated with database applications. Moreover, since it effectively reduces the number of branches in the application, predication makes larger basic code blocks, which also contributes to higher performance.
- Branch prediction. Another mechanism that processors use, branch prediction, consists of estimating the most likely outcome of code branches and collecting all of the instructions that will be needed after the branch executes in an instruction cache. Assuming that the prediction is correct, the processor has access to all of the instructions that are needed after branch execution, and it can proceed more swiftly. The accuracy of branch prediction is therefore very important to application performance.
The Itanium microarchitecture supports communication of branch information from the compiler to the processor. This capability enables branch prediction to be more accurate, as well as reducing the performance deficits associated with remaining branch mis-predictions. Oracle and Intel engineers worked together closely to tune the implementation of compiler-to-processor communication in Oracle 10g, another substantial performance measure, which complements predication.
- Speculation. By moving costly operations such as cache loads out of their normal sequence during program execution, speculation enables the processor to complete these operations in advance of when they are needed. Although these operations still take the same amount of time that they would otherwise, their outcome is available to support dependent processes sooner. That availability prevents idle processor cycles that would otherwise occur while a portion of the execution engine waited for the speculated operation to complete.
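The idea behind predication, the first item in the list above, can be approximated in portable C. The sketch below shows a branching function next to a version that computes a predicate, evaluates both outcomes, and keeps only the one that applies. This is a software analogy with hypothetical function names. On Itanium-based systems the compiler emits predicated instructions itself, and nothing here is specific to that hardware.

```c
#include <assert.h>

/* Branching version: the processor must predict which way the 'if' goes. */
int clamp_branch(int x, int limit)
{
    if (x > limit)
        return limit;
    return x;
}

/* Predicate-style version: 'over' is 1 or 0, both candidate results
   are formed, and the multiplication discards the one not needed. */
int clamp_select(int x, int limit)
{
    int over = x > limit;
    return over * limit + (1 - over) * x;
}

/* Checks that both versions agree on a few inputs around the limit.
   Assumes limit is not within 3 of the int range boundaries. */
void check_clamp_agreement(int limit)
{
    for (int x = limit - 3; x <= limit + 3; x++)
        assert(clamp_branch(x, limit) == clamp_select(x, limit));
}
```

The second version contains no conditional jump in its source logic, so there is nothing to mis-predict. The cost is the extra work of forming a result that is then thrown away, which is the trade-off the predication discussion above describes.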
These and other characteristics of the Itanium Processor Family make these processors well-suited to handling the very largest enterprise workloads. For this reason, Itanium architecture is widely deployed in high-performance computing environments at scientific and commercial research facilities all over the world. The dual-core members of the Itanium Processor Family, which are scheduled to be introduced by the end of 2005, promise to extend the benefits associated with the Itanium processor even further.
Development Tools and Services from Intel
Intel offers an extensive suite of software development tools, including compilers, analyzers, performance libraries, and threading and cluster tools, which enable developers to create world-class software applications. Intel® Software Development Products work with a variety of development environments, operating systems, industry-standard development tools, and leading processor manufacturers. With every purchase of an Intel Software Development Product, users receive one year of technical support and product updates from Intel® Premier Support, an interactive issue-management and communication Web site. This premium support service allows users to submit questions, download product updates, and access technical notes, application notes, and documentation.
The Intel® Software Partner Program allows enterprises to obtain access to emerging hardware platforms before they are introduced to the general public. This program can allow you to test applications on both 64-bit Intel Xeon processors and Itanium processors with various workloads, to gauge performance under laboratory conditions to determine specific sizing requirements for your solutions. These systems are available on a shared or dedicated basis over a secure Internet connection, as well as through on-site installation of test systems.
Just as the Oracle platform itself benefited from collaboration between hardware and software development teams at Intel, many organizations that deploy Oracle applications benefit from consulting services to assist them in moving to 64-bit Intel architecture. Intel® Solution Services is Intel Corporation’s worldwide professional services organization, helping enterprise companies capitalize on the full value of Intel® architecture through consulting focused on architecture transitions. Intel Solution Services uses its foremost expertise in Intel architecture and next-generation technologies, as well as its strong key industry alliances with Oracle and others, to design cost-effective, leading-edge solutions that help deliver superior business results.
Developers and decision makers who work with Oracle-based applications can benefit dramatically from the unprecedented level of tuning for Intel architecture that went into the development of Oracle 10g. The ongoing collaboration between Intel and Oracle will continue to make the most of Intel hardware platforms as they are introduced. Oracle developers should use Intel Software Development Products in the development of their applications and benefit from robust optimization right out of the box, as well as tuning automation from other Intel Software Development products such as the Intel® VTune™ Performance Analyzer and the Intel® Performance Libraries.
Solution and enterprise architects should consider joining the Intel Early Access Program in order to take full advantage of testing on leading-edge platforms, before they are introduced. This access is invaluable to enabling decision-making about which platform is best for a given implementation, as well as for tuning for the latest platform features. Those enterprises that need assistance for the largest challenges should also consider contacting Intel Solution Services, whose engineers are available to put cost-effective solutions to work using the most up-to-date tools, techniques, and architectures.
Intel EM64T and multi-core processing add substantially to the choices available for 64-bit enterprise platforms. The advent of 64-bit Intel Xeon processors provides modern enterprises with an industry-leading, highly scalable platform with very high price/performance for mainstream solutions. Itanium 2-based servers continue to yield extremely high performance for massive-scale solutions that demand mission-critical performance, stability, and scalability. Intel has announced a roadmap with more than 15 multi-core products in the next few years, encompassing both of these product families, which will benefit the Oracle platform even more.
The following Web pages provide points of departure for further inquiry into the main topics discussed in this article:
- Strategic Alliance Between Oracle and Intel* ensures that present and future generations of products from both companies will be engineered together for the highest levels of compatibility and performance.
- 64-bit Intel Processor-based Solutions offer compelling business and IT advantages, whether you are migrating your 32-bit applications to 64 bits, or replacing expensive and proprietary RISC-based platforms.
- Intel Software Development Products are a full suite of tools that can help developers easily create faster software on Intel architecture.
- Intel® Multi-Core Processors can be expected to drive a new era of server performance and flexibility, providing platforms that can better handle escalating workloads and rapidly evolving usage models.
About the Author
Matt Gillespie is an independent technical author and editor working out of the Chicago area and specializing in emerging hardware and software technologies. Before going into business for himself, Matt developed training for software developers at Intel Corporation and worked in Internet Technical Services at California Federal Bank. He spent his early years as a writer and editor in the fields of financial publishing and neuroscience.