面向英特尔® 架构优化的 Caffe*:使用现代代码技巧

This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
Authored by Last updated on 07/06/2019 - 16:40

整理您的数据和代码: 数据和布局 - 第 2 部分

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on concepts discussed in Part 1, to consider parallelism, both vectorization (single instruction multiple data SIMD) as well as shared memory parallelism (threading), and distributed memory computing.
Authored by David M. Last updated on 07/06/2019 - 16:40

借助 SIMD 数据布局模板优化数据布局

Financial service customers need to improve financial algorithmic performance for models such as Monte Carlo, Black-Scholes, and others. SIMD programming can speed up these workloads. In this paper, we perform data layout optimizations using two approaches on a Black-Scholes workload for European options valuation from the open source Quantlib library.
Authored by Nimisha R. (Intel) Last updated on 12/12/2018 - 18:00