Parallel Beam Backprojection on Sandy Bridge EP


Tomographic image reconstruction is computationally very demanding. In filtered backprojection as well as in iterative reconstruction schemes, the most time-consuming steps are usually forward projection and backprojection.
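
To make the backprojection step concrete, here is a minimal reference sketch of 2D parallel beam backprojection for a single pixel. It is not the optimized algorithm described in this article. The function name `backproject_pixel`, the `center` offset, the unit bin spacing, and the use of linear interpolation are all assumptions chosen for illustration.

```python
import math

def backproject_pixel(sinogram, angles, x, y, center):
    """Sum the contribution of every projection to the pixel at (x, y).

    sinogram[k][j] is the (already filtered) value of detector bin j for
    projection angle angles[k], given in radians. Detector bins are
    assumed to be one pixel apart, with bin `center` under the origin.
    """
    total = 0.0
    for row, theta in zip(sinogram, angles):
        # Parallel beam geometry: the ray through (x, y) hits the detector
        # at the projection of (x, y) onto the detector axis.
        t = x * math.cos(theta) + y * math.sin(theta) + center
        j = math.floor(t)
        if 0 <= j < len(row) - 1:
            w = t - j  # fractional position between bin j and bin j + 1
            total += (1.0 - w) * row[j] + w * row[j + 1]
    return total
```

Every pixel repeats this loop over all projection angles, and each iteration performs only a few arithmetic operations per value loaded from memory, so the cost of the step depends heavily on how the projection data is read.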

We present performance results for a high-performance 3D parallel beam backprojection algorithm optimized for Intel’s new Sandy Bridge EP architecture.

Compared to a “naïve” straightforward implementation, our optimized algorithm uses Sandy Bridge’s enhanced vector capabilities, i.e. its 256-bit AVX vector instruction set, to backproject 8 images simultaneously. It also uses an optimized memory layout in order to fully exploit the computational power of Sandy Bridge and thereby reduce reconstruction time.
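
The idea of backprojecting 8 images at once can be modeled in plain Python, as sketched below. A 256-bit AVX register holds eight single-precision values, so one lane can be assigned to each of eight images. The sketch assumes that the eight images are slices sharing the same in-plane parallel beam geometry, so the detector position and interpolation weight are computed once and reused for all lanes. The interleaved layout and the names `interleave` and `backproject_pixel_x8` are illustrative assumptions, not the authors' implementation.

```python
import math

LANES = 8  # eight 32-bit floats fill one 256-bit AVX register

def interleave(sinograms):
    """Pack LANES sinograms, each indexed [angle][bin], into one flat list
    in which the LANES values for the same (angle, bin) are adjacent."""
    assert len(sinograms) == LANES
    n_angles, n_bins = len(sinograms[0]), len(sinograms[0][0])
    flat = []
    for k in range(n_angles):
        for j in range(n_bins):
            flat.extend(s[k][j] for s in sinograms)
    return flat, n_bins

def backproject_pixel_x8(flat, n_bins, angles, x, y, center):
    """Backproject pixel (x, y) in all LANES slices at once."""
    acc = [0.0] * LANES
    for k, theta in enumerate(angles):
        # The geometry is shared by all slices, so it is computed only once.
        t = x * math.cos(theta) + y * math.sin(theta) + center
        j = math.floor(t)
        if 0 <= j < n_bins - 1:
            w = t - j
            lo = (k * n_bins + j) * LANES  # first of the 8 values at bin j
            hi = lo + LANES                # first of the 8 values at bin j + 1
            # In a vectorized version this inner loop would be a handful of
            # 8-wide instructions: two loads, two multiplies, two adds.
            for lane in range(LANES):
                acc[lane] += (1.0 - w) * flat[lo + lane] + w * flat[hi + lane]
    return acc
```

In this model the inner loop over lanes is the part that maps onto vector instructions, while the trigonometry and index arithmetic are amortized over eight output values.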

Backprojection algorithms in CT imaging are bandwidth-limited problems, so choosing a memory layout that makes optimal use of the caches is essential to fully exploit the computational power of a given system.

Results show that using a cache-optimized memory layout during backprojection increases performance by about 300% compared to backprojection with a non-optimized memory layout.
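
This summary does not spell out the optimized layout, so the following sketch only illustrates why layout matters for cache usage. It compares a hypothetical slice-major ("planar") layout with a hypothetical interleaved layout by counting how many 64-byte cache lines must be touched to fetch the same detector bin from 8 slices. The offset functions, the buffer sizes, and the assumption that the buffer starts on a cache line boundary are all illustrative.

```python
CACHE_LINE = 64  # bytes per cache line on Sandy Bridge
FLOAT = 4        # bytes per single-precision value
LANES = 8        # slices processed together

def planar_offset(s, k, j, n_angles, n_bins):
    """Byte offset when each slice's sinogram is stored as its own block."""
    return ((s * n_angles + k) * n_bins + j) * FLOAT

def interleaved_offset(s, k, j, n_angles, n_bins):
    """Byte offset when the LANES values per (angle, bin) are adjacent."""
    return ((k * n_bins + j) * LANES + s) * FLOAT

def cache_lines_touched(offset_fn, k, j, n_angles, n_bins):
    """Distinct cache lines needed to read bin j at angle k from all slices."""
    return len({offset_fn(s, k, j, n_angles, n_bins) // CACHE_LINE
                for s in range(LANES)})
```

With, say, 180 angles and 512 bins, `cache_lines_touched(planar_offset, 3, 100, 180, 512)` gives 8, while the interleaved variant gives 1. This toy count is not a performance prediction, but it shows how a layout can change the memory traffic behind the same arithmetic.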
