Advanced Image Reconstruction Algorithm Running on Intel® Xeon® Processors
By Kerry Evans (Intel), Terry Sych (Intel), and Steven Johnson (Intel) in collaboration with GE Healthcare
Computed tomography (CT) has advanced dramatically during the past ten years, offering a non-invasive technique for examining patients as an alternative to the exploratory surgeries that were once routine clinical practice. Significant strides have also been made in the effort to minimize the radiation exposure associated with CT.
Paving the way for improved image quality at lower dose is GE*'s model-based iterative reconstruction (MBIR) technology, called Veo* [1] (pronounced vay-oh). MBIR algorithms are able to reduce noise [2] and improve the resolution of reconstructed images compared to traditional reconstruction methods. This allows physicians to obtain the information desired for diagnosis using a dramatically lower dose. In development, the MBIR approach showed great promise by significantly increasing image quality, but processing times were too long for clinical use. This obstacle was overcome when GE and Intel engineers teamed up to reduce the processing time from several days for one case to one hour for multiple cases. Veo is now implemented in the GE Discovery* CT750 HD (Figure 1), and in early evaluation has generated head, chest and abdomen CT cases using X-ray dosages of less than 1 millisievert (mSv), well below the typical average dosages of 5 to 10 mSv required today. By comparison, average exposure to background radiation is about 3 mSv per year [3].
The ultimate goal in CT, of course, is to provide clinicians with the highest quality images as fast as possible, maximizing diagnostic accuracy and speed, while optimizing staff productivity and patient throughput. This brief reviews key MBIR technology and the steps taken to dramatically decrease CT image processing time on servers equipped with the Intel® Xeon® processor.
1 In clinical practice, the use of Veo may reduce patient CT dose depending on the clinical task, patient size, anatomical location and clinical practice. A consultation with a radiologist and a physicist should be made to determine the appropriate dose to obtain diagnostic image quality for the particular clinical task.
2 Noise measured as pixel noise standard deviation.
3 Mettler, Jr. FA, et al., Effective Doses in Radiology and Diagnostic Nuclear Medicine: A Catalog, Radiology, July 2008, Vol. 248, No. 1, pp. 254-263
CT scanning is a significant advance in medical imaging that aids in the diagnosis of a wide range of illnesses and injuries. The raw data from a CT scan is not an image and is therefore not readable by a human; the data must be processed to reconstruct cross-sectional images that clinicians can view. CT dates back to the 1970s in the UK, where Godfrey Hounsfield developed the first CT scanner, which was used to scan a human brain [4].
Despite major advances in X-ray sources, detector arrays, gantry mechanical design and computer performance, one component of CT scanners has remained virtually constant for the past 25 years: the reconstruction algorithm [5].
Transition to MBIR
A technique known as filtered back-projection (FBP) has been the foundation of commercially available CT reconstruction techniques since the 1970s. Its advantages are speed and formulas with a closed-form solution, requiring just a single pass over the acquired data. Although FBP is still widely used today, its image quality is particularly sensitive to both signal and noise levels. The signal level is established by selecting the proper scanning protocols (output level of the X-ray source, CT gantry rotation speed, detector collimation, helical pitch, etc.). Noise can result from fluctuations in the X-ray flux, the detector signal generation process, electronic noise in the data acquisition system, and the X-ray attenuation properties of the patient's body. While these noise sources can be characterized, FBP does not account for them in the reconstruction process. Thus, as the radiation level is decreased (i.e., the signal level is reduced), image quality suffers. In addition, to make the mathematics manageable, FBP ignores important geometric properties of the CT system and assumes perfect responses for all components (i.e., an ideal point X-ray source, an ideal point detector, and an infinitely small image voxel). These simplifications lead to suboptimal spatial resolution in the reconstructed images.
A powerful new class of iterative reconstruction algorithms has been designed to reconstruct much higher quality images from CT scan data obtained at greatly reduced radiation levels. Veo’s model-based iterative reconstruction (MBIR) algorithm utilizes system noise models, in addition to X-ray absorption models, to produce high resolution images that allow clinicians to see anatomy in much more detail. Compared to FBP reconstruction, Veo produces much sharper HD images, as shown in Figure 2.
Today and Future
Due to limitations in computing power and reconstruction technology, model-based iterative approaches have not been practical for commercial CT scanners until now. Processing times have been reduced to a point where Veo can be used in today’s clinical workflow. Continued decreases in processing times are expected over time, driven by advancements in computing hardware and enhancements to MBIR algorithms.
“Advanced model-based algorithms can extract more information from CT scan data. Our design philosophy with Veo* is to provide previously unattainable levels of combined resolution improvement and noise reduction in CT images in order to enhance diagnostic information at dramatically lower dose.”
Jiang Hsieh, Chief Scientist, GE Healthcare.
Benefits of Veo* Technology
Launched in 2008 as the first iterative reconstruction technology, adaptive statistical iterative reconstruction (ASiR) is making a profound impact on over 1,000 GE CT scanners by enabling radiologists to obtain the images they desire. ASiR may allow the use of a lowered mA protocol, thereby reducing the required dose by up to 50 percent on the Discovery CT750 HD [6, 7]. Veo offers the next step in performance. Today, routine exams like chest, abdomen and pelvis scans may require 4-10 millisieverts (mSv) of X-ray exposure. However, Veo technology promises to deliver enhanced image quality using less than 1 mSv.
When compared to previous GE reconstruction methods, Veo’s capabilities change the rules of CT imaging by applying revolutionary new modeling techniques to deliver lower noise, resolution gain and artifact suppression. Clinicians will benefit from higher spatial resolution and improved low-contrast detectability when diagnosing or treating disease. The impact is like the movement from standard TV to high-definition TV.
“We are at the beginning of a very interesting advancement in CT imaging, and the future of Veo* lies in every application. In my opinion, it is one of the major advancements in CT imaging, as important as the development of helical and multi-detector CT.”
Jean-Louis Sablayrolles, MD, Centre Cardiologique du Nord (CCN), Saint-Denis, France.
Remarkably, Veo has the ability to deliver these improvements at unprecedented low dose levels, benefiting patients, as illustrated in Figure 3.
6 In clinical practice, the use of ASiR may reduce CT patient dose depending on the clinical task, patient size, anatomical location and clinical practice. A consultation with a radiologist and a physicist should be made to determine the appropriate dose to obtain diagnostic image quality for the particular clinical task.
7 ASiR dose reduction was measured on a standard 20 cm water phantom. The test involved maintaining constant pixel standard deviation as the mA was reduced from 300 to 150 mA at 120 kV.
MBIR Algorithm Challenges
The key to reconstruction algorithm design is to accurately model the interactions of radiation with matter as X-rays are produced in the tube, attenuated through the patient, measured at the detector, and transformed into a digital signal. In order to deliver a faithful representation of the actual process, individual models for measured noise statistics, system optics, radiation and detection physics, medical image characteristics, etc., have to be developed to reconstruct images that accurately represent the acquired scan data.
Because of the complexity of the model description, no closed-form solution is available. Therefore, the estimate of the solution is iteratively determined through multiple passes over the scan data.
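To make the idea of "multiple passes over the scan data" concrete, the following is a minimal sketch of an iterative, noise-weighted reconstruction on a toy problem. It is not the Veo algorithm, whose models are not given in this brief. The system matrix `A`, the per-measurement weights `w`, the step size and the pass count are all illustrative assumptions. Each pass forward-projects the current image estimate, compares it with the measurements, and back-projects the weighted mismatch to refine the estimate.

```python
# Toy illustration only; NOT the Veo/MBIR implementation.
# A, y, w, step and passes are hypothetical values chosen for the example.

def reconstruct(A, y, w, passes=200, step=0.1):
    """Estimate x by reducing sum_i w[i] * (y[i] - (A x)[i])**2 with gradient steps.

    A: list of rows (one row per measurement), y: measurements,
    w: weights, larger for measurements assumed to be less noisy.
    """
    n = len(A[0])
    x = [0.0] * n  # start from an empty image
    for _ in range(passes):
        # Forward projection: what the current estimate would have measured.
        residual = [y_i - sum(a * x_j for a, x_j in zip(row, x))
                    for row, y_i in zip(A, y)]
        # Back-projection of the weighted mismatch updates every unknown.
        for j in range(n):
            grad_j = sum(w_i * row[j] * r
                         for row, w_i, r in zip(A, w, residual))
            x[j] += step * grad_j
    return x


# Two unknown "voxels", three measurements; the third is trusted less.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
w = [1.0, 1.0, 0.5]
estimate = reconstruct(A, y, w)
```

In this toy case the estimate converges toward the values 2 and 3 that generated the measurements. A production algorithm would add the optics, physics and image models described above, which is where most of the computational cost arises.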
MBIR reconstruction requires much more computing power than its predecessor, filtered back-projection. At the early stages of research, the MBIR algorithm took multiple days to process an image, clearly too long for use in clinical practice. Subsequently, the reconstruction time has been decreased to the point where multiple exams can be processed per hour using a multi-pronged approach incorporating the following:
- Algorithm optimization resulting from joint research of GE engineers and academic research partners
- Intel® processor microarchitecture improvements made over several years
- A very dense, high performance IBM* Blade Server system
- Algorithms tuned by GE and Intel engineers to increase speed on Intel Xeon processors
In 2006, GE and Intel began collaborating with the objective of dramatically reducing the processing time of the MBIR algorithms. GE produced a simpler version of the MBIR application that was easier to benchmark and sent it to Intel for analysis. Intel looked closely at memory accesses and profiled where the CPU spent the most time. Impressive performance gains were achieved by restructuring the algorithms to run more efficiently on multi-core processors and reordering data structures to reduce cache misses and paging. Over this time period, Intel introduced two new generations of processors, boosting performance significantly with the Intel Xeon processor that features an integrated memory controller and Intel® QuickPath Interconnect (Intel® QPI) technology. These hardware and software enhancements are described in more detail in the following sections.
Initially, the MBIR algorithm was single threaded and could only run on a single processing core. The code was restructured to use two threading models: a data-parallel model that splits the data elements among the available threads, and a process-parallel model that divides the types of processing among the threads. This multithreaded approach enabled the software to take advantage of the full processing capacity of a board with two Intel Xeon processors, for a total of 8 cores and 16 threads. Threading the code produced a speedup of about 10 times, as illustrated in Figure 4.
There are 16 threads because the processor incorporates Intel® Hyper-Threading Technology (Intel® HT Technology) [8], which allows each physical core to process two software threads concurrently, increasing performance by as much as 30 percent [9, 10, 11]. More details about Intel HT Technology are presented in the section below. Multithreading made it necessary to introduce a low-overhead synchronization mechanism to keep all cores working together and prevent data issues, such as race conditions.
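The data-parallel model and the need for low-overhead synchronization can be sketched as follows. This is an illustrative example, not the GE/Intel code, which this brief does not show. The function name, the chunking scheme and the sum-of-squares workload are assumptions. Each thread works on a private slice of the data and touches shared state only once, inside a short critical section, which avoids a race on the shared total while keeping locking overhead small.

```python
import threading

# Illustrative sketch; parallel_sum_of_squares is a hypothetical helper.

def parallel_sum_of_squares(values, num_threads=4):
    """Split `values` among threads and combine per-thread partial results."""
    total = 0.0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(v * v for v in chunk)  # private work, nothing shared
        with lock:                           # one short critical section per thread
            total += partial                 # without the lock this update could race

    # Ceiling division so every element lands in exactly one chunk.
    chunk_size = max(1, (len(values) + num_threads - 1) // num_threads)
    threads = [
        threading.Thread(target=worker, args=(values[i:i + chunk_size],))
        for i in range(0, len(values), chunk_size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every slice has been processed
    return total


result = parallel_sum_of_squares(list(range(1000)))
```

In CPython the global interpreter lock prevents this example from showing a real speedup, so it illustrates only the structure: partition the data, compute privately, and synchronize as rarely as possible.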
Two Threads Per Core
Intel® Hyper-Threading Technology (Intel® HT Technology) [8] provides separate data paths for two tasks, which means the processor maintains two execution states at the same time. As a result, the CPU will process another task if the task it’s executing stalls (e.g., waiting for an I/O device), which eliminates wasteful idle time.
The performance improvement derived from Intel HT Technology is illustrated in Figure 5, showing three multi-tasking examples. First, the tasks are executed sequentially, task 1 followed by task 2. Second, the tasks are assigned alternating time slots. These first two examples require about the same amount of time because they both incur significant delays when the CPU must wait for data. Third, Intel HT Technology executes both tasks concurrently, taking advantage of idle time to work on another task and thus reducing overall execution time. The key benefits of Intel HT Technology are greater performance, lower power and smaller form factor, compared to adding another processor to increase compute capacity.
Figure 5. Three Multi-Tasking Examples
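The timing argument behind these examples can be reproduced with a deliberately simplified model. The sketch below is an assumption-laden toy, not a description of the actual hardware. Each task is a hypothetical sequence of "compute" and "stall" segments measured in abstract ticks, only one task may compute in any tick, and stalls elapse in the background.

```python
# Toy timing model; task shapes and tick counts are hypothetical.

def expand(task):
    """Turn [("compute", 2), ("stall", 3)] into one entry per tick."""
    return [kind for kind, duration in task for _ in range(duration)]

def run_sequential(tasks):
    """One task at a time: the core sits idle during every stall."""
    return sum(duration for task in tasks for _, duration in task)

def run_two_contexts(task_a, task_b):
    """Two execution states share one core; a stalled task frees the core."""
    queues = [expand(task_a), expand(task_b)]
    pos = [0, 0]
    ticks = 0
    while any(pos[i] < len(queues[i]) for i in range(2)):
        core_busy = False
        for i in range(2):
            if pos[i] >= len(queues[i]):
                continue                    # this task has finished
            if queues[i][pos[i]] == "stall":
                pos[i] += 1                 # waiting needs no core time
            elif not core_busy:
                core_busy = True            # at most one task computes per tick
                pos[i] += 1
        ticks += 1
    return ticks


task = [("compute", 2), ("stall", 3), ("compute", 2)]
sequential_ticks = run_sequential([task, task])
concurrent_ticks = run_two_contexts(task, task)
```

With these made-up task shapes, running the two tasks back to back takes 14 ticks, while overlapping one task's stalls with the other's computation takes 9. The actual benefit depends on how often real workloads stall.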
Analysis of the data stride identified how to order the 3D data so that the image is reconstructed most efficiently. A close examination of the data indices revealed that the stride was not uniform and that data accesses were scattered across the 4 GB dataset. The data was rearranged so that indices always increase and cache reuse from subset to subset is maximized. These changes also allowed the processors’ prefetchers to work more efficiently, which improved the cache hit rate.
Using the Intel® VTune™ Performance Analyzer, it was possible to identify an additional penalty associated with memory access and memory paging. With proper tuning of the memory configuration, this penalty became insignificant. Overall, data reordering and memory tuning doubled performance.
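The effect of the reordering described above can be illustrated with a small sketch. The index values and helper names are hypothetical, and the "total stride" metric is only a rough proxy for cache behavior. Visiting the same elements in increasing index order sharply reduces how far each access jumps from the previous one, which is the kind of pattern hardware prefetchers handle well.

```python
# Illustrative sketch; the access pattern below is made up.

def total_stride(indices):
    """Sum of the jumps between consecutive accesses (smaller is more local)."""
    return sum(abs(b - a) for a, b in zip(indices, indices[1:]))

def reorder_for_locality(indices):
    """Visit the same elements, but in monotonically increasing order."""
    return sorted(indices)


scattered = [9000, 12, 4500, 13, 9001, 4501]
ordered = reorder_for_locality(scattered)

stride_before = total_stride(scattered)
stride_after = total_stride(ordered)
```

For this made-up pattern the total stride falls from 31451 to 8989 while the set of elements visited is unchanged. Reordering of this kind is only valid when the result of the computation does not depend on the order of the accesses.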
The Intel VTune Performance Analyzer also helped identify hot spots in the algorithm. Critical areas were hand-coded using Intel® Streaming SIMD Extensions (Intel® SSE) intrinsics to maximize use of the vector processing capabilities of the Intel Xeon processor.
GE engineers also switched from the GNU* GCC* compiler to the Intel® C++ Studio XE, improving performance by about 20 percent.
During the course of the project, Intel launched two new generations of Intel Xeon processors with greater performance, which further decreased the processing time of the MBIR algorithm. In all, the combination of the previously discussed software optimizations and Intel® microarchitecture improvements enabled about a one hundred times increase in efficiency of the MBIR algorithm. Some of the Intel Development Tools available to help optimize software are described in the section below.
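As a rough consistency check on the figures quoted in this section, the sketch below multiplies the individually reported gains and compares the product with the overall figure of about one hundred times. It assumes the gains are independent and compose multiplicatively, which the text does not state, so the remainder it attributes to the hand-coded Intel SSE sections and the newer processor generations is only an estimate.

```python
# Back-of-the-envelope arithmetic using the approximate figures quoted above.
# Multiplicative composition of the gains is an assumption.

reported_gains = {
    "threading": 10.0,                         # "about 10 times"
    "data reordering and memory tuning": 2.0,  # "doubled performance"
    "compiler switch": 1.2,                    # "about 20 percent"
}
overall_gain = 100.0                           # "about a one hundred times increase"

def combined(factors):
    """Product of the individual speedup factors."""
    product = 1.0
    for factor in factors:
        product *= factor
    return product

accounted_for = combined(reported_gains.values())
remaining = overall_gain / accounted_for       # Intel SSE tuning plus new hardware
```

Under that assumption the three quantified steps account for a factor of about 24, leaving a factor of roughly 4 for the vectorized hot spots and the two processor generations combined.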
Intel Development Tools Overview
Developers of signal processing applications have a wide choice of development tools from Intel and the broad Intel ecosystem. The benefits of using these comprehensive tool suites are many and impact every phase of the software development process.
Intel® C++ Compiler
The Intel® C++ Compilers for Linux* and Microsoft* Windows* operating systems are optimized to harness key properties of Intel® architecture processors and deliver optimal performance. They take advantage of a complex set of heuristics to decide which assembly instructions can best optimize the performance in various areas, including memory access, branch prediction, vectorization and floating point operations.
Intel® Math Kernel Library (Intel® MKL)
Intel® Math Kernel Library (Intel® MKL) is a library of highly optimized, extensively threaded math routines for floating-point-intensive applications that require maximum performance. Core math functions include BLAS, LAPACK, ScaLAPACK, Sparse Solvers, Fast Fourier Transforms, Vector Math and more.
Intel® Integrated Performance Primitives (Intel® IPP)
Intel® Integrated Performance Primitives (Intel® IPP) offers a rich set of library functions and codecs capable of speeding up the development of highly optimized routines for the handling of multimedia formats and data of any kind. They have been hand optimized at a low level to provide maximum performance and ease of use with Intel® processor-based platforms.
Intel® VTune™ Performance Analyzer
Designed to help developers find bottlenecks in their applications, the tool profiles how the application is using CPU time and computing platform resources throughout the code.
Intel® Application Debugger
A rich and user friendly Eclipse* RCP-based graphical user interface, combined with OS signal and thread awareness, enables developers to cross-debug more easily by finding coding issues that affect application runtime behavior.
Eclipse*-based Integrated Development Environment
Intel® software development products can be used with the Eclipse Integrated Development Environment (IDE).
Advancing Image Reconstruction
Over the past several years, CT has continued to demonstrate value due to its versatile diagnostic capability, non-invasive application and ability to visualize fine anatomic detail. Innovative new technologies have improved the diagnostic information available to clinicians while lowering radiation dose. GE believes further advancements, like Veo, offer potential for another leap in resolution and reduction of patient radiation exposure - even below the level of today's CT scanners.
This is possible, in part, due to the exceptional computing performance of Intel Xeon processors and dense server design from IBM. In addition, Intel engineers helped GE optimize their MBIR algorithms for Intel Xeon processors, reducing processing time from several days to under an hour. Another important consideration for GE was the fact that both Intel and IBM support long product life cycles, a necessity in this market segment where medical equipment must undergo long and extensive regulatory approval.
To learn more about Computed Tomography solutions from GE Healthcare, please visit http://www.gehealthcare.com/euen/ct/products/discovery_ct750hd/index.html
To learn more about Intel's solutions for Embedded Computing, please visit www.intel.com/go/medical.
About the Authors
Kerry Evans is a Platform Architect in the Intelligent Systems Group at Intel Corporation where he works with customers to design and implement solutions based on Intel hardware and software technologies. Kerry joined Intel in 2005 with 30 years of experience in software development, healthcare and medical research. He received his B.S. in Electrical Engineering in 1975 and M.Eng. and Ph.D. in Bioengineering in 1977 and 1979, respectively, from the University of Utah. He holds 4 US patents.
Terry Sych is a Staff Software Engineer in the Platform Architecture Enabling group at Intel Corporation. He joined Intel in 1992, and has worked on performance analysis and software optimization of enterprise applications for the last 10 years. Terry works with software vendors analyzing, tuning, and optimizing applications. He received a B.S. degree in Computer Engineering from the University of Michigan in 1981 and an MSEE from the University of Minnesota in 1988. He holds 3 US patents.
Steven Johnson is the GE Healthcare Alliance Manager at Intel.
4 Source: http://nobelprize.org/nobel_prizes/medicine/laureates/1979/perspectives.html
5 Source: Xiaochuan Pan, Emil Y Sidky and Michael Vannier, 2009 http://iopscience.iop.org/0266-5611/25/12/123009
8 Intel® Hyper-Threading Technology (Intel® HT Technology) requires a computer system with an Intel® processor supporting Intel HT Technology and an Intel HT Technology enabled chipset, BIOS, and operating system. Performance will vary depending on the specific hardware and software you use. See www.intel.com/products/ht/hyperthreading_more.htm for more information including details on which processors support Intel HT Technology.
9 Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
10 For more information go to http://www.intel.com/performance.
11 For an application example, see /en-us/articles/intel-hyper-threading-technology-analysis-of-the-ht-effects-on-a-server-transactional-workload
“We believe this new technology will play a key role in moving us towards our goal of the 1 mSv study and enable clinicians to achieve their diagnostic needs at unheard of doses without compromising image performance.”
Steve Gray, Vice President & General Manager of Computed Tomography & Advantage Workstation for GE Healthcare*