By Mikhail Smelyanskiy, Jason Sewall, Dhiraj D. Kalamkar, Nadathur Satish, Pradeep Dubey, Nikita Astafiev, Ilya Burylov, Andrey Nikolaev, Sergey Maidanov, Shuo Li, Sunil Kulkarni, Charles H. Finan, Ekaterina Gonina
In the past 20 years, computerization has driven explosive growth in the volume of financial markets and in the variety of traded financial instruments. Increasingly sophisticated mathematical and statistical methods, together with the rapidly expanding computational power that drives them, have given rise to the field of computational finance. The wide applicability of these methods, their computational intensity, and their real-time constraints require high-throughput parallel architectures.
In this work, we have assembled a financial analytics workload for derivative pricing, an important area of computational finance. We characterize and compare our workload’s performance on two modern parallel architectures: the Intel® Xeon® Processor E5-2680 and the recently announced Intel® Xeon Phi™ coprocessor (formerly codenamed ‘Knights Corner’). In addition to analyzing the peak performance of the workload on each architecture, we quantify the impact of several levels of compiler and algorithmic optimization.
Overall, we find that large caches on both architectures, out-of-order cores on the Intel® Xeon® processor, and large compute and memory bandwidth on the Intel® Xeon Phi™ coprocessor deliver a high level of performance on financial analytics.