Why isn’t my question answered in this FAQ?
This FAQ attempts to highlight a subset of the publicly available multi-core information at Intel.com. It is not intended to be a comprehensive guide to all multi-core-related material that Intel has published. See the next item for tips on finding answers elsewhere. If you’d like to suggest an update to this FAQ, please contact support or post your software-related multi-core questions directly in the Threading on Intel Parallel Architectures discussion forum. By posting your question in the Intel® Developer Zone Forums, you benefit from the knowledge of many people while the community benefits from the discussion.
I didn’t find the answer to my question in this FAQ. Where should I look next?
Start by searching Intel.com. It’s possible to search the subset of the site relating to software development by using the advanced search (http://www.intel.com/content/www/us/en/search.html). Another option is searching the Intel Developer Zone site using Google. To do this, add “site:software.intel.com” to a Google search. Next, search the Intel® Developer Zone Forums. The search box is at the bottom of the page. If the question hasn’t been answered, post it in the appropriate forum or contact Intel Customer Support.
How do I report a change in the status of an FAQ item?
As Intel continues to ramp multi-core architectures across its product lines and developers become more adept at writing multithreaded code, this FAQ will become dated or obsolete. If you believe that an item is no longer valid or in need of updating, please contact Intel® Developer Zone support. The entire developer community appreciates your assistance in keeping this FAQ current.
Where should I post or send questions that are not related to multi-core architecture or threading?
Intel Developer Zone Forums offer a place for public questions and answers on Intel Software Development Products, Intel platforms and technologies and other topics. Intel engineers participate and provide answers in the forums. Another option is contacting Intel® Customer Support.
What is multi-core architecture?
Explained most simply, multi-core processor architecture entails silicon design engineers placing two or more execution cores, or computational engines, within a single processor package. This multi-core processor plugs directly into a single processor socket, but the operating system perceives each of its execution cores as a discrete logical processor with all the associated execution resources. Multi-core chips do more work per clock cycle, are able to run at a lower frequency, and may enhance user experience in several ways such as improving performance of compute- and bandwidth-intensive activities.
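As a rough illustration of how software sees those logical processors, the sketch below asks the operating system how many it reports. It uses Python's standard `os.cpu_count()`; the helper name `logical_processor_count` is made up for this example. The number returned counts logical processors, so on a system with Hyper-Threading Technology enabled it can be higher than the number of physical cores.

```python
import os


def logical_processor_count() -> int:
    """Return the number of logical processors the OS reports (at least 1)."""
    # os.cpu_count() may return None when the count cannot be determined,
    # so fall back to 1 in that case.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("Logical processors visible to the OS:", logical_processor_count())
```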
What is the difference between multi-core architecture and Hyper-Threading Technology (HT Technology)?
HT Technology allows more efficient use of a single execution core by allowing multiple threads to share the core’s resources, whereas multi-core capability provides two or more complete sets of execution resources to increase compute throughput. Any application that has been threaded for HT Technology should deliver great performance when run on an Intel multi-core processor-based system. Accordingly, users will be able to take advantage of many existing applications that are already optimized for two threads from the earliest days of Intel’s transition to multi-core architectures across its desktop, mobile and server processor product lines. Multi-core architecture doesn’t signal the death of HT Technology. Instead, HT Technology makes it possible to further maximize performance by splitting the resources of each individual core in the multi-core system.
What is the impact of multi-core architecture on licensing? Is it going to increase software licensing costs? If so, for which applications, operating systems (OSes) or databases?
Software licensing decisions are the purview of individual software vendors. As a user and buyer of software, Intel believes in influencing licensing models through its purchasing negotiations and decisions. The company is advocating for the industry to continue licensing aligned with measurable software usage and business value by urging vendors to base licenses “per socket” rather than “per core” when software licensing is based on underlying hardware.
Intel’s strategy is to maintain industry platform pricing models. Intel believes multi-core architecture is a logical evolution of Hyper-Threading Technology and the industry’s history generally runs counter to dramatic increases in total solution (hardware plus software) costs. People have come to expect faster, cheaper and better products.
The good news is that many major software vendors, including Microsoft*, Red Hat and others, agree with Intel and have already announced per-socket licensing policies – treating a multi-core processor as a single CPU.
Is there any downside to multi-core architecture? Is it possible that my application will run slower?
Generally, any application that will work with an Intel single-core processor will work with an Intel multi-core processor. Multi-core platforms provide the next generation of performance, cost-efficiency and business value. For threaded applications, there is generally only upside. The exceptions include well-threaded, memory-intensive applications whose memory requests can easily exceed the capacity of the memory subsystem, which diminishes the performance of the application. Intel understands that not all applications (or, more precisely, important operations within a given application) are amenable to parallelization or threading. In these cases, developers should encourage their customers to look for different usage or deployment models, make use of greater multi-tasking, or look for additional solution features or functionality that utilize the additional cores.
Why is Intel implementing multi-core architectures across its product line?
Through its ongoing research and development efforts, Intel has maintained the doubling of transistors every couple of years for forty years. But this scaling could not continue indefinitely because of several less-friendly laws of physics. A 1993-era Intel Pentium Processor had around 3 million transistors, while today’s Intel® Itanium® 2 processor has nearly 1 billion transistors. If this rate continued, Intel processors would soon be producing more heat per square centimeter than the surface of the sun, which is why the problem of heat is already setting hard limits to frequency increases. These challenges and others, including power, memory latency, resistance-capacitance (RC) delay and scalar performance, are described in the 2004 Technology@Intel Magazine story titled “Architecting the Era of Tera.”
Despite these challenges, users continue to demand increased performance. Today, there are more than half a billion PC users worldwide. Home users rely on PCs for delivery and creation of rich media content, placing new demands on applications for encoding and decoding multimedia files and for editing video. Business users of software for 3D modeling, scientific calculations or high-end digital content creation have their own growing list of performance-heavy requirements. In both cases, performance means more than wringing additional benefits from a single application, because users commonly multi-task, actively toggling between two or more applications or working in environments in which many background processes compete for scarce processor resources.
Multi-core processors are the next innovation in Intel’s continuing commitment to enhancing computing architectures and platforms based on what people want and how they use technology. Intel believes multi-core platforms will empower the development of new applications that will enable wide-ranging advances in everything from medicine to IT, from the digital office to the digital home, from mobility solutions to the latest games. The company plans on bringing the benefits of its multi-core platforms to all its targeted segments: desktop and mobile PCs as well as servers and workstations.
How committed is Intel to multi-core architectures?
Very. Multi-core processor capability is central to the Intel platform-centric approach. The company continues to invest in the Intel® Tera-scale Computing Research Program, a research and development effort to scale today's multi-core processors up to designs that have tens or hundreds of energy-efficient cores with teraflops of compute capability. At the Fall 2006 Intel® Developer Forum, Intel senior fellow and director of the company’s corporate technology group, Justin Rattner, announced the company had developed an 80-core processor and was continuing to work on resolving data traffic, heat and latency issues and on determining ways these chips can run existing software and operating systems. In the marketplace, the move to threading and concurrency in software likely will be pulled along by the rapid transition to multi-core architectures.
If I do nothing, will I benefit from multi-core architecture? How? Or will my application’s performance suffer?
The answer to these questions depends on any number of scenarios. As a result of Intel’s efforts to drive SMP (symmetric multiprocessing) and Hyper-Threading Technology over the past decade, most mid-tier and back-end applications already architected for multi-processor servers are highly threaded today – particularly database, application and Web servers. For these applications, the question is essentially moot, although opportunities still may exist to improve threading performance and correctness.
Since multi-core processors can execute completely separate threads of code, a number of different usage models are possible. One thread can run from an application, a second thread runs from an operating system. Two applications can execute simultaneously. Concurrent workloads can be virtualized or a fault isolation and failover implementation can be employed. Additional threads can be utilized for background applications such as virus protection, security, compression, encryption and synchronization. Multi-core processors can be used to facilitate even more effective server consolidation.
But the answer to the questions remains “it depends.” Unthreaded applications certainly will run on a multi-core processor if you do nothing. The performance of an individual unthreaded application or workload on a single core of a multi-core processor essentially will mirror its performance on a single-core processor with equivalent clock speed, cache size and architecture, and front-side bus and I/O capabilities.
If you’re running an unthreaded application, the application can only make use of one core. When you thread the application, there should be some speedup as the operating system and the application run on separate cores. But Intel says that actual performance gains cannot be projected easily. The company does stress, however, that relying on straight-line execution flow for significant application performance gains in the long term simply will not bear the fruit it has in the past. Investment in concurrent programming is the way to boost performance today and in the future.
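Although real gains are hard to project, a widely used back-of-the-envelope model, Amdahl's law (not part of the original FAQ text), gives an upper bound on speedup when only part of the run time can be parallelized. The sketch below is an estimate under idealized assumptions (no synchronization, memory or scheduling overhead); the function name and the 80 percent figure are illustrative only.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper-bound speedup when `parallel_fraction` of run time scales across `cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical application in which 80% of the run time can be threaded.
    for cores in (1, 2, 4, 8):
        print(cores, "cores ->", round(amdahl_speedup(0.8, cores), 2), "x")
```

Under these assumptions, an application with 80 percent of its run time threaded gains at most about 2.5x on four cores, because the serial 20 percent does not shrink. Measured results are usually lower.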
Is any operating system (OS) particularly adept at benefiting from multi-core architecture? Which one(s)? Why?
All major OSes, including Mac OS X, Microsoft Windows Vista*, Windows Server*, Red Hat Linux*, and Novell SuSE Linux*, already are threaded to take advantage of Hyper-Threading Technology and now multi-core architecture. There’s a long history of broad OS support for existing scalable platforms. Thus, any OS that is SMP-capable can schedule single applications across multiple cores. Many OSes already support anywhere from 32 to 64 cores today. As a result, OS support is not an obstacle.
What applications are good candidates to be moved from serial to multithreaded to experience performance gains on multi-core systems?
Any program that is in a class of applications where threading is already relatively common – video encoding, 3D rendering, video/photo editing and high performance computing/workstation applications. These applications are especially amenable to thread-level parallelism because many of their computations can run simultaneously.
In the case of gaming applications, threading has the potential to be immensely beneficial.
Several video game vendors, including Remedy Entertainment, Valve, Epic Games and Ubisoft Entertainment, have already released multi-threaded versions of popular gaming titles and plan on threading future titles to take advantage of multi-core processors.
For example, Valve is integrating full support for multi-threading, including in the Source game engine and developer tools. Eventually all of the gaming company’s products will support multi-core technology, and Valve plans to backport it to older games. Valve president and co-founder Gabe Newell told Tech Report, “Quad-core will change every aspect of PC gaming. It will change how we create our games, how we provision our service, and how we design our games. The scalability we've seen in graphics over the last few years will now extend to physics, AI, animation, and all the systems which are critical to moving beyond the era of pretty but dumb games.”
However, Intel stresses that developers should not look at the question of performance from a single-application perspective. Different usage and deployment models, such as those outlined in the answer to "If I do nothing, will I benefit from multi-core architecture?" (above) and "Besides using threaded applications, when else might end users experience a performance gain on multi-core systems?" (below), can provide equal or greater end user benefits and experience in a multi-core environment.
Besides using threaded applications, when else might end users experience a performance gain on multi-core systems?
Multitasking users or those working in environments marked by lots of background processing also should benefit from multi-core systems. Behind-the-scenes processing is increasingly the norm in business computing environments. Examples include users who run background data mining queries while working on other tasks in the foreground or corporate IT departments that unobtrusively update software, troubleshoot hardware or perform virus scanning and other management tasks over the corporate network.
Do I have to become a threading expert? How?
If you have been developing applications for multi-processing environments already – in the enterprise arena, for example – then you or your co-workers may already have the required skills. For others, this is arguably a major industry transition and requires a fundamental shift towards concurrency in application development.
The fact is all major processor chip manufacturers are moving to multi-core architectures. The days of boosting clock-speeds to increase straight-line instruction throughput are winding down. Intel believes developers will reap long term rewards from their efforts to develop threading expertise today. The company has been working with operating system and application vendors during the past decade to optimize and enhance the threading capabilities of their software. The Intel Software and Solutions Group (SSG) Threading Enabling Program predates the introduction of Hyper-Threading Technology. This program is expanding further to support this software transition. SSG offers threading tools, compilers, and other performance-tuning toolkits; whitepapers; and other technical resources to help software developers implement thread-level parallelism enhancements in their code. See How Intel Can Help (below) for more information.
How much work/time/effort will it take me to thread my application?
Fair warning: thinking in terms of concurrency and threading, especially for the uninitiated, is difficult. Threaded software has the same potential for bugs as serial code. However, there are additional problems that can crop up. The most common of these is known as a ‘race condition,’ where multiple threads update the same memory location at the same time, but the order and timing of each thread's execution can change the results from one run to the next. Also, code that’s properly threaded doesn’t always perform faster on a multi-core system. This happens when the threads are re-serialized because they share a dependency on a single resource, such as I/O, memory or network throughput. Despite these challenges, resources are available at Intel to help. For example, Intel® Threading Analysis Tools save time by identifying hard-to-find threading errors and performance bottlenecks.
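To make the race condition described above concrete, here is a minimal sketch in Python using the standard `threading` module. The `Counter` class and `run_workers` function are hypothetical names for this example. The lock turns the read-modify-write on the shared value into a critical section so that no update is lost.

```python
import threading


class Counter:
    """A shared counter whose increment is protected by a lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write below is the critical section. Without the
        # lock, two threads could read the same old value and one update
        # would be lost (a race condition).
        with self._lock:
            self.value += 1


def run_workers(num_threads: int, increments_per_thread: int) -> int:
    """Increment one shared counter from several threads; return the final value."""
    counter = Counter()

    def work() -> None:
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers(num_threads=4, increments_per_thread=10_000))
```

With the lock in place, the result always equals the number of threads multiplied by the increments per thread. Note that the lock is itself a shared resource: threads that spend most of their time waiting for it are effectively re-serialized, which is the performance pitfall mentioned above.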
How can I determine if I should thread my application (or not)?
There are several aspects to consider when deciding if an application should be threaded. An important first step is formally identifying highly threaded application designs both as a near-term and long-term goal. In doing so, consider an application’s ability to handle changes, which typically are increases in system resources or data set sizes. The number of cores available in the future will only increase, and threading will minimize latency, boost an application’s scalability and make code more flexible. Retrofitting parallelization into a product’s architecture is likely to be more expensive in the long run than using an architecture that lends itself to multi-threading from the earliest design stages. By weighing an application’s timeframe for completion against its future performance on multi-core hardware, you can determine if threading an application is appropriate.
In addition, consider how many execution cores a given piece of software intends to support. If an application is architected specifically to support up to a certain number of cores, it may need to be redesigned once the mainstream machines that support it have moved significantly beyond that number of cores. By making the upper limit on the number of threads that the application can create configurable, you can change the allowable number of threads in the future as hardware evolves.
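One way to keep the thread limit configurable is sketched below, assuming a Python application that uses the standard `concurrent.futures.ThreadPoolExecutor`. The environment variable name `APP_MAX_THREADS` and the functions `resolve_max_threads` and `process_all` are hypothetical, chosen only for illustration. The default falls back to the number of logical processors the OS reports.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical setting name used only for this example.
MAX_THREADS_ENV = "APP_MAX_THREADS"


def resolve_max_threads(env=None) -> int:
    """Pick the worker-thread limit from configuration, defaulting to the CPU count."""
    env = os.environ if env is None else env
    default = os.cpu_count() or 1
    raw = env.get(MAX_THREADS_ENV)
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default  # Ignore malformed settings rather than fail.
    return max(1, value)


def process_all(items, env=None):
    """Apply a placeholder task to every item using at most the configured threads."""
    with ThreadPoolExecutor(max_workers=resolve_max_threads(env)) as pool:
        return list(pool.map(lambda x: x * x, items))


if __name__ == "__main__":
    print(process_all(range(5), env={MAX_THREADS_ENV: "2"}))
```

Because the limit is read at run time rather than fixed in the code, it can be raised as hardware with more cores becomes mainstream, without redesigning the application.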
When comparing the performance of applications, both serial and parallel, the bottom line is wall-clock execution time. Critically analyze what segments of code or operations can run in parallel and thus maximize performance. Epic Games founder and president Tim Sweeney told AnandTech that when his company began multi-threading its products developers focused on physics, animation updates, the renderer's scene traversal loop, sound updates and content streaming. The company didn’t attempt to multi-thread systems that were highly sequential and object-oriented, such as the gameplay. Sweeney added, “It's especially important to focus multi-threading efforts on the self-contained and performance-critical subsystems in an engine that offer the most potential performance gain. You definitely don't want to execute your 150,000 lines of object-oriented gameplay logic across multiple threads - the combinatorical complexity of all of the interactions is beyond what a team can economically manage. But if you're looking at handing off physics calculations or animation updates to threads, that becomes a more tractable problem.”
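Comparing wall-clock time can be as simple as the sketch below, which times a serial run and a threaded run of the same work with Python's `time.perf_counter()`. All function names here are hypothetical, and `simulated_task` only waits, standing in for an independent unit of work such as an I/O request. In standard CPython builds, CPU-bound pure-Python code is limited by the global interpreter lock, so threads would not show the same gain for that kind of work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def simulated_task(delay_s: float) -> float:
    """Stand-in for an independent unit of work (here it only waits)."""
    time.sleep(delay_s)
    return delay_s


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def run_serial(delays):
    """Run the tasks one after another on the calling thread."""
    return [simulated_task(d) for d in delays]


def run_threaded(delays):
    """Run the tasks concurrently, one worker thread per task."""
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        return list(pool.map(simulated_task, delays))


if __name__ == "__main__":
    delays = [0.05] * 4
    serial_result, serial_s = timed(run_serial, delays)
    threaded_result, threaded_s = timed(run_threaded, delays)
    assert serial_result == threaded_result  # Same answer either way.
    print(f"serial:   {serial_s:.3f} s")
    print(f"threaded: {threaded_s:.3f} s")
```

Checking that both versions return the same result before comparing their times keeps the comparison honest: a faster run that produces a different answer is a bug, not a speedup.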
Next, characterize the difficulty of parallelizing specific workloads by determining the amount of developer effort involved in creating the threaded version. Workloads may be seen as fitting into one of three broadly defined categories:
Easily threaded workloads: Problems which imply an obvious threading model and constitute about 10 to 20 percent of all workloads.
Moderately difficult-to-thread workloads: Workloads that can be parallelized with substantial effort, which is warranted when the potential performance gains protect a competitive advantage. Examples include some database applications, data mining, synthesizing, and text and voice processing. These tasks constitute 60 percent of workloads.
Very difficult-to-thread workloads: Workloads that are very difficult to parallelize due to linear arrangements where the input data of one subtask is generally dependent upon the output data of another. The business value of threading such workloads must be carefully weighed against the cost and technical complexity of doing so.
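The difference between the first and last categories can be sketched with two small Python functions. Both are hypothetical examples, not taken from any Intel material. In `brighten_all`, each item is independent, so the work maps directly onto a thread pool. In `running_balance`, each step needs the output of the previous one, so the loop as written cannot simply be split across threads.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel: int) -> int:
    """Independent per-item work: each result depends only on its own input."""
    return min(255, pixel + 40)


def brighten_all(pixels, max_workers: int = 4):
    # Easily threaded: items can be processed in any order, on any worker.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(brighten, pixels))


def running_balance(start: int, transactions):
    # Hard to thread as written: every step needs the previous step's output.
    balances = []
    balance = start
    for amount in transactions:
        balance += amount
        balances.append(balance)
    return balances


if __name__ == "__main__":
    print(brighten_all([0, 100, 230]))
    print(running_balance(100, [20, -50, 5]))
```

The example is about the structure of the dependencies only. In standard CPython builds, a tiny CPU-bound task like `brighten` would not actually run faster on threads.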
If your application is network intensive, also consider the ratio of application CPU time to network time and whether or not the network stack for your OS is already parallelized, as outlined in the paper titled “Critical Analysis of the Need for Parallelizing Network Stacks,” by Intel Network Research Scientist Annie Foong and Intel Network Software Engineer Erik J. Johnson.
Once the above issues have been addressed, consider available tools and resources that can help speed up or simplify the multi-threading process. The Threading for Multi-Core Developer Community on the Intel Developer Zone offers several papers, including “Writing parallel programs: a multi-language tutorial introduction,” which presents the high-level characteristics of different notations in sufficient detail for programmers to make an intelligent choice of which notation to invest their time in mastering.
Does Intel believe every application should be threaded?
No. Intel is rapidly moving to multi-core architectures across all product lines and makes available many resources for developers interested in threading their applications. The company says that, in the future, the biggest performance gains are to be had from architectural innovations and not necessarily from increasing clock speeds. However, the threading cost-benefit analysis is left to developers. Applications that are not CPU-bound and don’t need to scale for increasing performance, reliability and security may not be good candidates for threading.
I've decided it makes sense to thread. Now what?
Register for the Intel Developer Zone, which provides technical articles and code samples; strategic articles by Intel executives and other industry luminaries; Web-based training sessions, face-to-face courses, and Webcasts; software products for evaluation and purchase, SDKs, and tools; and community forum discussions with experts and peers on development products, platforms, and technologies. Registration for the Intel Developer Zone – which features lots of threading-related content – is free and allows for access to all Developer Centers content, a bi-weekly newsletter, the Solutions Catalog and a personalized home page.
Is there a process/checklist I can use to get started in learning about threading?
The Intel® Parallel Programming Community provides technical information, tools, innovation & support from industry experts. Learn how to best develop parallel programs and multi-threaded software on Multi-Core and Multi-Processor platforms.
For university faculty, the Intel® Academic Community, the one-stop shop for training on Intel’s software technologies, offers a comprehensive threading curriculum delivered online and through classroom lectures. Courses include introductions to multi-threaded programming, tools for threading, advanced thread programming and threading specific applications.
Where is the multi-core Developer Center on the Intel Developer Zone?
Taking advantage of multi-core chips simply requires writing multithreaded applications, and the Intel® Developer Zone does maintain a Parallel Programming & Multi-Core Development Community page.
How do I get support in threading my code to take advantage of multi-core architecture?
Anyone can submit general technical support questions via e-mail by visiting Intel Customer Support. Registered users of the Intel Developer Zone (registration is free) can post questions in the Intel Developer Zone Forums; Intel engineers participate and provide answers in the forums. Developers who purchase Intel® Software Development Products or become an Intel® Software Partner are eligible for Intel Premier Support, which provides more advanced and personalized troubleshooting and support by e-mail. Real Solutions for Businesses like Yours provides fee-for-service transition guidance for multi-core processors tailored to a customer’s unique business and IT environment.
Where are the multi-core or threading discussion forums?
Intel Developer Zone Forums offer discussions on Intel Software Development Products, Intel platforms and technologies and other topics. Intel engineers participate and provide answers in the forums. The Threading on Intel Parallel Architectures forum is well-trafficked and addresses the challenge of developing OS- and application-level threaded applications for server and desktop environments.
Are there any books available explaining how to code for multi-core architecture?
Multi-Core Programming, a 2006 Intel Press book by Shameem Akhter and Jason Roberts, helps software developers write high-performance multi-threaded code for Intel's multi-core architecture while avoiding the common parallel programming issues associated with multi-threaded programs. Intel has been working with leading software vendors for more than a decade to deliver thread-optimized code: first for multiprocessor platforms, then for Hyper-Threading Technology and now for multi-core architecture.
What Intel Software Development Products can help to create threaded software?
Intel® Thread Checker V. 3.1 (and Intel® Thread Profiler V. 3.1) locate threading bugs and threading performance bottlenecks. Other tools that may be useful in creating multithreaded code, including analyzers that support sampling of threads over time, compilers that support the OpenMP* threading model, and performance libraries that themselves are threaded for performance, are described at the Intel Developer Zone Parallel Programming Web page.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804