CREATIVE ENTHUSIASTS WORKING WITH ENTERTAINMENT MEDIA SUCH AS HD VIDEO NEED HIGHLY RESPONSIVE TOOLS THAT SUSTAIN THE CREATIVE FLOW.
After all, waiting for an effect to render or suffering through herky-jerky video playback are sure ways to squelch inspiration. Therefore, tapping into the full performance potential of today’s desktop, laptop, tablet PC, and mobile computing architectures—and eliminating latency-inducing bottlenecks—is essential for media application developers. These are typically daunting, time- and resource-intensive tasks. Developers need to stay abreast of the latest innovations for a multitude of platforms and maintain separate code paths that are fine-tuned and optimized for each of them.
A number of powerful Intel® developer tools help streamline the process of analyzing and optimizing media and other graphics-intensive applications. For example, Intel® Graphics Performance Analyzers (Intel® GPA), Intel® VTune™ Amplifier XE, and the Intel® Media Software Development Kit (Intel® Media SDK), used separately or together, allow developers to increase the parallelization of their code, readily identify and eliminate hotspots and bottlenecks, and accelerate media encoding, decoding, preprocessing, and transcoding operations across a variety of Intel® platforms, including legacy processors, the current 2nd generation Intel® Core™ processor family, and Intel® Atom™ processors.
Optimal Performance for Maximum Impact
Optimization is a critical part of the product development workflow, especially for media application developers. ArcSoft, a leading developer of video-editing, conversion, and sharing applications, for example, devotes 50 percent of its development cycle to the optimization process. Why is optimization so important? It all boils down to performance. “Today’s users don’t want to wait for effects to render or videos to load,” said Yanlong Sun, ArcSoft deputy general manager of Video and Home Entertainment. “Tapping into the excellent performance of Intel® processor architecture through fine-tuning and optimization means that users don’t need to wait.”
For Corel, one of the world’s top software companies, optimization is also important. As Jan Piros, senior strategic product manager at Corel, explained: “Platform optimization is fundamental to our development. A significant amount of our effort goes into this because the gains made can be felt throughout many of our features. It’s an effort whose impact is multiplied throughout the software and of great benefit to the user.”
With each new generation of processor, more cores are added to a single piece of silicon. For example, 2nd gen Intel® Core™ i7 processors have up to six cores on a single 32nm chip. Intel® Hyper-Threading Technology enables each core to run two threads simultaneously. To make use of all that processing power, software developers tune and optimize their code for multi-core, multithreaded operation. This allows the software to use all available cores and threads on a system, helping to boost performance in the process.
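As a rough illustration of spreading work across every available core and hardware thread, the sketch below splits a batch of frames into one chunk per logical processor and processes the chunks in separate worker processes. It is a minimal sketch, not code from any of the products discussed here: the names split_into_chunks, brighten_chunk, and process_frames are hypothetical, and the "effect" is a stand-in that adds 10 to each pixel value.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal slices."""
    n_chunks = max(1, min(n_chunks, len(items)))
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def brighten_chunk(chunk):
    """Hypothetical per-frame effect: add 10 to every pixel, clamped at 255."""
    return [[min(255, pixel + 10) for pixel in frame] for frame in chunk]


def process_frames(frames, workers=None):
    """Process frames in parallel, one chunk per worker process."""
    # os.cpu_count() reports logical processors, so cores with two
    # hardware threads each are counted twice.
    workers = workers or os.cpu_count() or 1
    chunks = split_into_chunks(frames, workers)
    with ProcessPoolExecutor(max_workers=len(chunks)) as pool:
        # map() returns results in submission order, so frame order is kept.
        results = list(pool.map(brighten_chunk, chunks))
    return [frame for chunk in results for frame in chunk]
```

Calling process_frames(frames) from a script's main entry point returns the processed frames in their original order. Worker processes are used here because CPU-bound Python threads do not run in parallel; a native application would more likely use threads directly.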
In addition, 2nd gen Intel Core processors feature Intel® Quick Sync Video and Intel® Clear Video HD Technology, powerful integrated acceleration technology for encoding, decoding, and preprocessing HD video formats and codecs. To access the incredible performance boost that Intel Quick Sync Video and Intel Clear Video HD Technology offer, developers must use the Intel Media SDK.
“Speed has a dramatic impact on our user’s experience with our product. It’s clear that, with its technology, support, and dedicated people, Intel understands this—and that makes a big difference for us at Corel. Intel provides the tools that we can use to optimize our software and build a superior user experience. The Intel® Media SDK puts all the tools we need in our hands and helps us deliver maximum performance in our products.”
—JAN PIROS, SENIOR STRATEGIC PRODUCT MANAGER, COREL CORP.
Gathering Accurate Performance Intelligence
Zeroing in on the exact cause of any particular latency, when hundreds of modules and millions of lines of code are involved, is like trying to find the proverbial needle in a haystack. Discovering bottlenecks and analyzing CPU and graphics workloads at the system, task, and intra-frame levels can help save developers a significant amount of time during optimization and development of their application.
Intel GPA provides developers with a suite of analysis tools for visualizing and optimizing applications efficiently from the system level all the way down to individual elements such as draw calls within a single video frame. In addition, the standalone Intel GPA Frame Analyzer tool lets developers experiment and see the performance effect of potential optimizations without making source code changes.
Intel GPA includes the following tools and features:
- Intel GPA System Analyzer heads-up display provides a system-level view of CPU, GPU, and DirectX* metrics, and measures processor, memory, and graphics performance in real time, revealing potential bottlenecks.
- Intel GPA Frame Analyzer allows elements in individual frames to be analyzed and allows developers to see the visual and performance impact of changes without actually changing the code.
- Intel GPA Platform Analyzer allows developers to see how the interaction of multiple tasks and subtasks running on CPUs and GPUs affects performance. Support for Intel® HD Graphics 2000/3000 (GPU) hardware enables developers to analyze how efficiently their application takes advantage of hardware acceleration in 2nd gen Intel Core processors. Intel GPA now contains customizable task coloring, support for task sub-states, and more.
- The Intel® Instrumentation and Tracing Technology (Intel® ITT) API for the Intel® GPA Platform Analyzer enables the developer’s application to generate and control the collection of trace data during its execution. Instrumentation allows developers to recognize, analyze, and visualize trace data in the graphics driver and the Intel Media SDK library, and data from the DirectX library. Developers can monitor relationships that occur in the time between task submission and execution, identifying bottlenecks within the context of the entire computing platform.
- OpenCL* support through the Intel® OpenCL SDK allows developers to analyze and optimize performance of OpenCL 1.1 standard code run on Intel multi-threaded processors. This makes it easier to adopt OpenCL technology and to optimize OpenCL content.
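To make the idea of application-generated trace data more concrete, here is a minimal sketch of task instrumentation: the application marks the beginning and end of named tasks, and the collected events can later be summarized or visualized. This is not the Intel ITT API. The Tracer class, its task method, and the injectable clock are hypothetical names chosen for illustration only.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Hypothetical in-process tracer that records begin/end times of tasks."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock   # injectable so the tracer can be tested
        self.events = []      # list of (task_name, begin, end) tuples

    @contextmanager
    def task(self, name):
        """Mark the enclosed block as one execution of the named task."""
        begin = self._clock()
        try:
            yield
        finally:
            # Record the event even if the task raised an exception.
            self.events.append((name, begin, self._clock()))

    def total_time(self, name):
        """Sum the durations of all recorded executions of a task."""
        return sum(end - begin for n, begin, end in self.events if n == name)
```

An application would wrap each stage it cares about, for example `with tracer.task("decode"): ...`, and inspect tracer.events afterward. A real tracing tool adds far more, such as cross-thread and driver-level events, which this sketch does not attempt.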
Taking the Guesswork Out of Multi-Core Scalability
Intel VTune Amplifier XE is another powerful Intel developer tool for threading and performance optimization. It provides an accurate performance profile, displayed on a timeline with data filtering and frame analysis capabilities, so developers can tune their applications based on hard, actionable data rather than educated guesses. Data filtering screens out information, such as start-up statistics, that would otherwise mask accurate results. Performance data is color coded to speed up the process of finding common causes of slow performance in parallel programs, such as lock waits, where threads wait too long for a lock while cores go underutilized.
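The lock-wait problem can be illustrated with a small sketch that measures how long threads spend blocked on a lock. It is only an illustration of the kind of data a profiler surfaces, not how Intel VTune Amplifier XE collects it. TimedLock and measure_contention are hypothetical names, and the hold time is an arbitrary value.

```python
import threading
import time


class TimedLock:
    """A lock that accumulates how long callers waited to acquire it."""

    def __init__(self):
        self._lock = threading.Lock()
        self.total_wait = 0.0

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()
        # Updated while holding the lock, so no extra synchronization is needed.
        self.total_wait += time.perf_counter() - start
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def measure_contention(hold_seconds=0.05):
    """Run one thread that holds the lock and one that must wait for it."""
    lock = TimedLock()
    holder_has_lock = threading.Event()

    def holder():
        with lock:
            holder_has_lock.set()
            time.sleep(hold_seconds)   # simulate work done inside the lock

    def waiter():
        holder_has_lock.wait()         # start only once the lock is taken
        with lock:
            pass

    threads = [threading.Thread(target=holder), threading.Thread(target=waiter)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return lock.total_wait
```

The value returned by measure_contention() is roughly the hold time, since the second thread is idle for as long as the first keeps the lock. Shrinking the work done inside the lock shrinks that wait, which is the kind of change a lock-wait report points toward.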
“Intel® VTune™ Amplifier XE and Intel® GPA are integral to our development process. We use Intel VTune Amplifier XE for CPU usage profiling, which helps us pinpoint execution hotspots and analyze potential performance issues in multithreading. We’re currently developing an HD codec using both Intel VTune Amplifier XE and the Intel GPA tool for optimization across the CPU and GPU and for fine-tuning multi-thread scheduling.”
—YANLONG SUN, DEPUTY GENERAL MANAGER OF VIDEO AND HOME ENTERTAINMENT GROUP, ARCSOFT
Every Intel® processor includes an on-chip Performance Monitoring Unit (PMU) to facilitate optimization. Intel VTune Amplifier XE uses the PMU to keep track of various events, and with presets for Intel processor microarchitecture, developers don’t have to keep track of complex event names. In fact, the tool automatically highlights routines that it recognizes as potential candidates for optimization and even offers suggestions as to what might be causing trouble, such as the number of cycles per instruction being too high.
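Cycles per instruction (CPI) is simply the number of clock cycles a routine consumed divided by the number of instructions it retired. The sketch below shows how such a metric could be used to rank routines. The function names, the counter values in the usage note, and the threshold of 1.0 are all assumptions for illustration; they are not the tool's actual presets.

```python
def cycles_per_instruction(cycles, instructions):
    """Return CPI for one routine from two hardware counter totals."""
    if instructions <= 0:
        raise ValueError("instructions retired must be positive")
    return cycles / instructions


def flag_high_cpi(counters, threshold=1.0):
    """Return (routine, cpi) pairs above the threshold, worst first.

    counters maps a routine name to (cycles, instructions retired).
    The threshold is an illustrative assumption, not a tool default.
    """
    flagged = []
    for routine, (cycles, instructions) in counters.items():
        cpi = cycles_per_instruction(cycles, instructions)
        if cpi > threshold:
            flagged.append((routine, cpi))
    flagged.sort(key=lambda item: item[1], reverse=True)
    return flagged
```

For example, with made-up counters of 4,000,000 cycles and 1,000,000 instructions for one routine, its CPI is 4.0 and it would be flagged, while a routine with 900,000 cycles and 1,200,000 instructions (CPI 0.75) would not.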
Intel® VTune™ Amplifier XE Performance Profiler helps developers tune single-node threading by visualizing thread behavior, evaluating thread load balancing, and finding thread synchronization bottlenecks. Intel VTune Amplifier XE is available as a standalone application or as a component of Intel® Parallel Studio XE. Intel Parallel Studio XE is another suite of developer tools that consists of Intel® C/C++ and Fortran compilers as well as optimized performance and parallel libraries.
Accelerating Encode, Decode, and Transcode Performance on Intel® Platforms
Intel Media SDK offers developers quick, easy access to hardware-accelerated video encoding, decoding, and pixel preprocessing capabilities with Intel-optimized software fallback. The cross-platform API allows developers to take advantage of powerful Intel Quick Sync Video and Intel Clear Video HD Technology hardware acceleration that offloads processor-intensive tasks to the graphics component of 2nd gen Intel Core processors. Even when run on a platform that lacks hardware acceleration, applications created with Intel Media SDK still gain the benefit of tuned, optimized, and multi-threaded software-based video encoding and decoding that’s tailored to the capabilities of the Intel platform on which they are running.
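The hardware-first, software-fallback behavior described above follows a common dispatch pattern, sketched here in a few lines. This is not the Intel Media SDK API, and according to the article the SDK handles the selection itself. Every name below (BackendUnavailable, hardware_encoder, software_encoder, create_encoder) is hypothetical, and the "encoders" merely tag each frame with the path that handled it.

```python
class BackendUnavailable(Exception):
    """Raised when an encoder implementation cannot be used on this platform."""


def hardware_encoder(has_gpu_encoder):
    """Hypothetical hardware path; fails when no accelerator is present."""
    if not has_gpu_encoder:
        raise BackendUnavailable("no hardware encoder on this platform")
    return lambda frame: ("hw", frame)


def software_encoder():
    """Hypothetical software path; assumed to be available everywhere."""
    return lambda frame: ("sw", frame)


def create_encoder(has_gpu_encoder):
    """Prefer the hardware path and fall back to software if it is missing."""
    try:
        return "hardware", hardware_encoder(has_gpu_encoder)
    except BackendUnavailable:
        return "software", software_encoder()
```

The calling code is the same in both cases: it asks for an encoder once and then feeds it frames, without branching on which path was selected.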
Intel Media SDK is extremely useful for creating video-editing and -processing, media conversion, streaming, and playback, as well as videoconferencing applications. It supports encoding and decoding of H.264, VC-1, and MPEG-2 format content, as well as stereoscopic 3D (S3D) content, through the Multi-View Video Coding (MVC) H.264 extension, all of which are used in applications such as ArcSoft ShowBiz* 5 and MediaConverter* 7, and Corel VideoStudio* Pro X4, among other popular video editing, conversion, and sharing applications developed using Intel Media SDK.
Intel Media SDK 2012 also includes the following benefits:
- Dramatically streamlined workflow. Creating video encoding and decoding routines to support multiple hardware platforms can be tedious and time consuming, particularly when dealing with the intricacies of the Microsoft DirectX Video Acceleration (DXVA) interface. Intel Media SDK is designed to hide much of that complexity behind a single API.
- Simplified development. Intel Media SDK 2012 introduces Opaque Memory, a new memory type that lets the SDK choose the best memory configuration for the client platform the software is running on, helping to make development easier and applications perform better.
- New tools. Intel Media SDK includes the same tracer technology as the other Intel® software developer tools mentioned, which makes it easier to capture and log calls as they are sent to the media libraries, providing developers with an excellent debugging resource.
- Support for new camera usage models. New enhancements make it easier to develop applications that use video cameras, such as videoconferencing applications like Skype* and video surveillance applications.
- Integral support for future Intel® architecture advances. The Intel Media SDK provides a flexible and extensible architecture, enabling application support for leading Intel® hardware, beginning with the Intel® Graphics Media Accelerator (Intel® GMA) 4500 HD, Intel HD Graphics, and 2nd gen Intel Core processors. It also extends to future Intel architecture. This allows developers to create applications today using the Intel Media SDK and take advantage of hardware acceleration available now and in the future without rewriting any program code.
Additionally, Intel Media SDK provides lower-latency encode and decode operations, dynamic bit-rate control, forced key-frame insertion, and reference list selection, as well as long-term reference and temporal scalability.
Putting Intel® Software Developer Tools to Work
ArcSoft Fine-Tuning and Optimization
ArcSoft is a leading developer of multimedia imaging technologies and applications for desktop and embedded platforms. The company creates software for smartphones, feature phones, tablets, PCs, smart TVs, and cameras. Its retail video software such as TotalMedia* Theatre*, MediaConverter, and ShowBiz* enables consumers to author, edit, and play back various HD formats such as AVCHD and Blu-ray* on PCs and smart devices.
Optimization is a crucial portion of ArcSoft’s development cycle, and Intel software developer tools play a key role in the process. “Intel VTune Amplifier XE and Intel GPA are integral to our development process,” said Yanlong Sun, deputy general manager of the Video and Home Entertainment Group at ArcSoft. “We use Intel VTune Amplifier XE for CPU usage profiling, which helps us pinpoint execution hotspots and analyze potential performance issues in multi-threading. We’re currently developing an HD codec using both Intel VTune Amplifier XE and the Intel GPA tool for optimization across the CPU and GPU and for fine-tuning multi-thread scheduling.”
ArcSoft used Intel Media SDK to optimize its video playback and editing applications, helping it develop easy-to-use, highly scalable products that run on a range of processors, including 2nd gen Intel Core processors and the Intel Atom processor.
ArcSoft worked closely with Intel to optimize its TotalMedia Theatre decoder pipeline for Intel GMA technology in Intel Atom processors. “TotalMedia Theatre delivers smooth, high-quality Blu-ray playback on Intel Atom processors,” Sun said.
Intel VTune Amplifier XE, Intel GPA, and Intel Media SDK were instrumental in allowing ArcSoft to parallelize the core engine used in both ShowBiz and MediaConverter. “Parallel tasking gives our users the ability to simultaneously output finished content to, say, YouTube* and a handheld device format,” Sun said. “The Intel GPA tool gave us a frame-by-frame GPU analysis to help us improve our decode and encode pipelines. Intel’s multi-core, multi-threaded processor technology significantly reduces the conversion time. The user can now convert four or more files concurrently while leaving the processor free for other tasks.”
Looking to the future, ArcSoft has been using a prerelease version of the Intel Media SDK 2012 to develop real-time transcoding technology for videoconferencing. ArcSoft is also working closely with Intel engineers to optimize a number of OEM applications for the Android* OS that run on Intel architecture and Intel Atom processor platforms.
Corel Optimized Performance and Multi-Core Scalability
Corel, one of the world’s top software companies with over 100 million active users in more than 75 countries, develops innovative products that are easy to learn and use. Corel VideoStudio* Pro X4, its flagship video-editing software, offers video makers of all skill levels a comprehensive set of video-editing tools, along with plug-ins for rock-steady video stabilization and broadcast quality titles, animations, and graphics.
In developing VideoStudio Pro X4, Corel engineers used Intel Media SDK and Intel Parallel Studio XE to achieve optimal load balancing between CPU and GPU media-processing pipelines in 2nd gen Intel Core processor architectures. “The decode/encode functions in the latest Intel Media SDK allowed us to achieve very fast transcoding speed, as well as fast read-back between video and system memory,” said Chung-Tao, director of development at Corel.
Intel VTune Amplifier XE helped Corel engineers identify bottlenecks and hotspots by analyzing modules related to a single feature or feature set instead of having to look at the entire VideoStudio Pro code base. Once identified, bottlenecks were eliminated, resulting in code optimized for performance and multi-core scalability.
“The design of 2nd gen Intel Core processors, with its Intel HD Graphics, Intel Quick Sync Video, Intel Clear Video HD Technology, increased parallelism, and greater throughput, can really stand up to the stresses of HD video,” Piros said. “It lets us deliver a video editor with a smooth and responsive creative experience that really wasn’t possible with previous-generation chips.”
Corel’s new MotionStudio 3D* is an easy-to-use 3D and motion-graphics application that makes titles and graphics for video. “MotionStudio is very graphics intensive,” Chung-Tao said. “Looking ahead to future releases, we can absolutely see where Intel GPA will help optimize our very complex and computing-intensive graphics.”
“Speed has a dramatic impact on our user’s experience with our product,” Piros concluded. “It’s clear that with its technology, support, and dedicated people, Intel understands this—and that makes a big difference for us at Corel. Intel provides the tools that we can use to optimize our software and build a superior user experience. The Intel Media SDK puts all the tools we need in our hands and helps us deliver maximum performance in our products.”
About the Author
Before signing on as one of the writing muses for Rose & Her Minions, Dominic Milano spent over 30 years in print, online, and event media production, working on DV magazine, Game Developer magazine and the Game Developer Conference, Keyboard magazine, Guitar Player magazine, and more.