We will discuss parallel reductions in OpenMP for loops.
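As a minimal sketch of the idea, the function below sums an array with the OpenMP reduction clause. The name sum_array is hypothetical and chosen only for illustration. It assumes a compiler with OpenMP support (for example, GCC or Clang with -fopenmp); without that flag the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the first n elements of a using an OpenMP parallel reduction.
   sum_array is a hypothetical helper name used for illustration. */
double sum_array(const double *a, int n)
{
    assert(a != NULL || n == 0);

    double sum = 0.0;

    /* Each thread accumulates into its own private copy of sum;
       the private copies are combined with + when the loop ends. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += a[i];
    }
    return sum;
}
```

Without the reduction clause, all threads would update the shared variable sum concurrently, which is a data race. Note that floating-point addition is not associative, so the parallel result may differ from the serial one in the last few bits.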
An Intro to Multi-Level Parallelism for High-Performance Computing by Clay Breshears | Life Sciences Software Architect, Intel
Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on the concepts discussed in Part 1 to consider vectorization (single instruction, multiple data, or SIMD), shared memory parallelism (threading), and distributed memory computing.
Checksums are widely used for checking the integrity of data in applications such as storage and networking. We present fast methods of computing checksums on Intel® processors. Instead of computing the checksum of the input with a traditional linear method, we describe a faster method to split the data into a number of interleaved parallel streams, compute the checksum on these segments in...
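The interleaving idea can be illustrated with a deliberately simple checksum. The sketch below assumes a plain additive checksum (the sum of all bytes modulo 2^32), which is not necessarily the checksum the article targets. The function names and the choice of four streams are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Traditional linear method: one accumulator, one long dependency chain. */
uint32_t checksum_linear(const uint8_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    return sum;
}

/* Interleaved method: bytes 0,4,8,... go to stream 0, bytes 1,5,9,... to
   stream 1, and so on. The four accumulators are independent, so the
   processor can overlap the additions. */
uint32_t checksum_interleaved(const uint8_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    /* Handle the remaining 0 to 3 bytes. */
    for (; i < n; ++i) {
        s0 += data[i];
    }
    /* Combine the partial results; unsigned addition wraps modulo 2^32,
       so the order of combination does not change the result. */
    return s0 + s1 + s2 + s3;
}
```

Combining the streams is trivial here because addition is commutative and associative. Checksums with a positional component need a more careful recombination step, which this sketch does not attempt.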
When confronted with nested loops, the granularity of the computations that are assigned to threads will directly affect performance. Loop transformations such as splitting and merging nested loops can make parallelization easier and more productive.
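One way to see the granularity issue: if only the outer loop of a nest is parallelized and it has fewer iterations than there are threads, some threads sit idle. The sketch below merges the two loops into a single iteration space with the OpenMP collapse clause. The function scale_matrix and its row-major layout are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply every element of a rows x cols row-major matrix by factor.
   scale_matrix is a hypothetical example function. */
void scale_matrix(double *m, int rows, int cols, double factor)
{
    assert(m != NULL);

    /* collapse(2) merges the i and j loops into one iteration space of
       rows * cols iterations, so the work can be divided evenly even
       when rows is smaller than the number of threads. */
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            m[(size_t)i * cols + j] *= factor;
        }
    }
}
```

The collapse clause requires perfectly nested loops, with no statements between the two for headers. When that is not the case, splitting the loop body first can produce a nest that qualifies.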
Many applications and algorithms contain serial optimizations that inadvertently introduce data dependencies and inhibit parallelism. One can often remove such dependences through simple transforms, or even avoid them altogether through.
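A common instance of this pattern is a running value that is updated incrementally to save a multiplication. The sketch below shows a serial version with such a loop-carried dependency, and a transformed version that computes the value directly from the loop index. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Serial optimization: x is advanced by dx on every iteration, so each
   iteration depends on the previous one and the loop cannot be split
   across threads as written. */
void fill_grid_serial(double *out, int n, double x0, double dx)
{
    assert(out != NULL || n == 0);
    double x = x0;
    for (int i = 0; i < n; ++i) {
        out[i] = x;
        x += dx;
    }
}

/* Transformed version: x is recomputed from i, which removes the
   dependency between iterations and makes the loop safe to parallelize. */
void fill_grid_parallel(double *out, int n, double x0, double dx)
{
    assert(out != NULL || n == 0);
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        out[i] = x0 + i * dx;
    }
}
```

The transformed loop costs one extra multiplication per iteration, which is usually a small price for being able to run the iterations independently. The two versions can differ slightly in floating-point rounding, because repeated addition accumulates error differently from a single multiplication.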
Our building blocks are the finite difference (FD) compute kernels that are typically used in reverse time migration (RTM) algorithms for seismic imaging. The computations performed by the ISO-3DFD (isotropic 3-dimensional finite difference) stencils play a major role in accurate imaging of complex subsurface structures in oil and gas surveys and exploration. Here we leverage the ISO-3DFD discussed in  and  and...
The deep learning framework Caffe* has been optimized for Intel® Xeon Phi™ processors. This article provides detailed instructions on how to compile and run this version of Caffe*, optimized for Intel® architecture, to obtain the best performance on Intel Xeon Phi processors.
The Black-Scholes benchmark is one of the 13 benchmarks in PARSEC. This benchmark performs option pricing with the Black-Scholes partial differential equation (PDE). The Black-Scholes equation is a differential equation that describes how, under a certain set of assumptions, the value of an option changes as the price of the underlying asset changes. Based on this formula, one can compute the...