Using Intel® Tools to Maximize the Performance of Image Processing Applications

Image Analyzer* Case Study

 

Image Analyzer* is a robust, real-time software application that inspects digital images for illicit content, such as commercial pornography. It is a complex analysis tool that employs eleven different decision variables to determine whether an image is clean or illicit. It is deployed at the point of delivery for email, Internet and cellular traffic. Both accuracy and speed are vital success metrics.

Early versions of this code-heavy software worked well for some applications, such as scanning email attachments. However, Image Analyzer* could not keep up with the volume, pace and complexity of content being generated and disseminated on the Internet and via cell networks. Running it on higher-performance hardware yielded only limited gains, because the application was not designed to take advantage of the multi-core architecture of the latest systems.

 

Image Analyzer* engineers partnered with Intel’s performance engineering team in India to tune the application for better performance on multi-core systems. Two software updates were delivered following the code tune-up, resulting in a combined 350 percent performance gain in both image volume and accuracy on the same hardware platform. Even greater improvements were expected once the software ran on systems with more processing cores. Early customers found that the application was more accurate than database-driven applications, but its limited performance confined its use to small-volume traffic centers, so Image Analyzer* worked with us at Intel to improve it.

 

During the first phase of optimization, for version 3.0 of the Image Analyzer* software, we used Intel® VTune™ Performance Analyzer, which inspects and analyzes code to identify bottlenecks in the architecture. We also used Intel® Thread Checker and Intel® Thread Profiler to analyze threading performance and identify possible threading bugs. With these three tools we were able to determine where the application was getting bogged down and why its performance was suffering. The analysis pointed to three critical problems with the code and architecture:

 

  • Lack of multithreading in the application
  • Some functions performed unnecessary memory allocations
  • The software engine was not thread-safe

Armed with these results, Image Analyzer* engineers used the same Intel tools to revise their application code and make it more efficient, a process that took approximately three months. After optimizing memory usage, they re-architected the software for threading, enabling it to run parallel threads on multi-core processors and gain performance accordingly.
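The three fixes described above (parallel work, fewer allocations, a thread-safe engine) can be sketched in a few lines. This is a minimal illustration only, not Image Analyzer* code: `score_image`, `analyze_all`, `BUFFER_SIZE` and the byte-sum "analysis" are hypothetical stand-ins. In CPython the global interpreter lock limits the speedup of CPU-bound threads, so read this as a sketch of the structure rather than of the performance.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

BUFFER_SIZE = 64  # hypothetical scratch-buffer size in bytes

# Thread-local storage: each worker thread gets its own scratch buffer,
# so no mutable state is shared between threads (thread safety).
_local = threading.local()


def _scratch() -> bytearray:
    """Return this thread's scratch buffer, allocating it only on first use."""
    buf = getattr(_local, "buf", None)
    if buf is None:
        buf = bytearray(BUFFER_SIZE)
        _local.buf = buf
    return buf


def score_image(pixels: bytes) -> int:
    """Hypothetical stand-in for the analysis: sum the first BUFFER_SIZE bytes."""
    view = memoryview(_scratch())
    n = min(len(pixels), BUFFER_SIZE)
    # Copy into the reused buffer instead of allocating a new one per image.
    view[:n] = memoryview(pixels)[:n]
    return sum(view[:n])


def analyze_all(images, workers=4):
    """Score every image on a pool of worker threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_image, images))


if __name__ == "__main__":
    print(analyze_all([bytes([1, 2, 3]), bytes([10] * 100), b""]))
```

Keeping the scratch buffer per thread avoids both repeated allocation and locking, which is one common way to address the second and third problems together.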

 

Stephen Tye, Director of Product Management for Image Analyzer*, says, “The results of the first project were astounding. Using an Intel® Core™2 Quad Processor-based system, we achieved a performance gain on the processing of clean images of 167%, and a performance gain on the processing of illicit images of 337%.”

 

In the next release of the software (version 4.0), Image Analyzer* wanted to improve accuracy further without degrading performance. To that end, we made extensive changes to the image-processing algorithms. This version of the software was also run through the Intel tools to ensure that the code architecture would deliver maximum performance.

 

In addition to the Intel tools mentioned above, Image Analyzer* used Intel® Compilers to further improve the performance of version 4.0 of the application. This allowed us to take advantage of Intel® Streaming SIMD Extensions 4 (SSE4) technology, which is targeted at improving the performance of media, imaging, and 3D workloads. SSE4 also provides hints that help improve memory throughput when reading from uncacheable write-combining memory.

With the release of Image Analyzer* version 4.0, tested on the same platform used for version 3.0, we realized a 53% performance gain when analyzing clean images and a 16% performance gain for illicit images.
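As a rough worked example, suppose each reported percentage is a throughput gain relative to the preceding version on the same platform. That is an assumption, since the text does not state the baseline of each measurement. Under it, the version 3.0 and version 4.0 gains compound multiplicatively rather than add. The helper name `compound_gain` is hypothetical, and the text does not say how its own combined figure was derived.

```python
def compound_gain(*gains_percent: float) -> float:
    """Combine successive percentage gains, each relative to the prior step."""
    factor = 1.0
    for gain in gains_percent:
        factor *= 1.0 + gain / 100.0  # e.g. a 167% gain means 2.67x throughput
    return (factor - 1.0) * 100.0


# Figures quoted in the case study: version 3.0 gain, then version 4.0 gain.
clean = compound_gain(167, 53)     # 2.67 * 1.53 = 4.0851x overall
illicit = compound_gain(337, 16)   # 4.37 * 1.16 = 5.0692x overall

if __name__ == "__main__":
    print(f"clean images:   {clean:.1f}% over the original baseline")
    print(f"illicit images: {illicit:.1f}% over the original baseline")
```

Under this assumption the cumulative gains come to roughly 308.5% for clean images and 406.9% for illicit images.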

 

This latest code tune-up has opened new market segments for Image Analyzer*, as Tye confirms, “…now we have the performance to match the need. We’re especially able to keep up with the volume of content going through mobile technology and social networking. We’re finally able to analyze it in a real world scenario and process that much content with improved accuracy. It’s exactly what we’ve been aiming for.”

 

Download the detailed case study

For more complete information about compiler optimizations, see the Optimization Notice.