We will discuss parallel reductions in OpenMP for loops.
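As a rough illustration of what such a reduction can look like, here is a minimal sketch in C. The function name `sum_array` is hypothetical and not taken from the article; only the `reduction` clause itself is standard OpenMP.

```c
#include <stddef.h>

/* Hypothetical helper: sums an array with an OpenMP reduction.
   Build with an OpenMP flag (e.g. -fopenmp for GCC or Clang). Without
   it the pragma is ignored and the loop simply runs serially. */
double sum_array(const double *a, size_t n)
{
    double sum = 0.0;
    /* Each thread accumulates into a private copy of sum; the private
       copies are combined with + when the loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += a[i];
    }
    return sum;
}
```

Because the order of the partial sums depends on the thread schedule, floating-point results may differ in the last bits from a serial run.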
When confronted with nested loops, the granularity of the computations that are assigned to threads will directly affect performance. Loop transformations such as splitting and merging nested loops can make parallelization easier and more productive.
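One way to merge a nested loop pair is the OpenMP `collapse` clause. The sketch below assumes a row-major matrix and a hypothetical function name `scale_matrix`; it is one possible illustration of granularity control, not the article's own example.

```c
/* Hypothetical helper: scales every element of a rows x cols
   row-major matrix in place. */
void scale_matrix(double *m, int rows, int cols, double factor)
{
    /* collapse(2) merges the two loops into one iteration space of
       rows * cols iterations, so work is divided at element
       granularity rather than row granularity. */
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            m[i * cols + j] *= factor;
        }
    }
}
```

Collapsing tends to help when the outer loop has fewer iterations than there are threads; when the outer loop is already large, parallelizing it alone is often enough.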
Many applications and algorithms contain serial optimizations that inadvertently introduce data dependencies and inhibit parallelism. One can often remove such dependencies through simple transforms, or even avoid them altogether through...
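A common case of this is a running value carried from one iteration to the next. The sketch below uses two hypothetical functions, `fill_serial` and `fill_parallel`, to show one simple transform: computing the value directly from the loop index so that iterations become independent.

```c
/* Serial-optimized form: x is updated incrementally, so iteration i
   depends on iteration i - 1 and the loop cannot be split safely. */
void fill_serial(double *out, int n, double start, double step)
{
    double x = start;
    for (int i = 0; i < n; i++) {
        out[i] = x;
        x += step;
    }
}

/* Transformed form: each value is computed from i alone, which
   removes the loop-carried dependence. */
void fill_parallel(double *out, int n, double start, double step)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        out[i] = start + i * step;
    }
}
```

The transformed loop trades one addition for a multiply and an add per iteration, and its results can differ from the incremental version in the last floating-point bits.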
An Intro to Multi-Level Parallelism for High-Performance Computing by Clay Breshears | Life Sciences Software Architect, Intel
Avoiding and Identifying False Sharing Among Threads (PDF 218KB)
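False sharing typically arises when threads update distinct variables that happen to sit on the same cache line. The sketch below shows one common mitigation, padding per-thread slots. The 64-byte line size, the slot count, and all names (`padded_counter`, `count_even`) are assumptions made for illustration, not details from the PDF.

```c
#include <stddef.h>

#define CACHE_LINE 64  /* assumed cache-line size in bytes */
#define NUM_SLOTS 8    /* hypothetical number of work chunks */

/* Padding makes each counter 64 bytes wide, so two counters never
   fall on the same (assumed 64-byte) cache line. */
struct padded_counter {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

/* Counts even numbers; each chunk writes only to its own slot. */
long count_even(const int *data, size_t n)
{
    struct padded_counter counts[NUM_SLOTS] = {{0}};

    #pragma omp parallel for
    for (int t = 0; t < NUM_SLOTS; t++) {
        size_t begin = n * (size_t)t / NUM_SLOTS;
        size_t end = n * (size_t)(t + 1) / NUM_SLOTS;
        for (size_t i = begin; i < end; i++) {
            if (data[i] % 2 == 0) {
                counts[t].value++;
            }
        }
    }

    long total = 0;
    for (int t = 0; t < NUM_SLOTS; t++) {
        total += counts[t].value;
    }
    return total;
}
```

Without the `pad` member, several `value` fields would share one line and every increment could invalidate that line in other cores' caches. Accumulating into a local variable and writing the slot once per chunk is another way to reduce the traffic.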
Checksums are widely used for checking the integrity of data in applications such as storage and networking. We present fast methods of computing checksums on Intel® processors. Instead of computing the checksum of the input with a traditional linear method, we describe a faster method to split the data into a number of interleaved parallel streams, compute the checksum on these segments in...
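The teaser does not say which checksum is used, so the sketch below substitutes a plain modular byte sum as a stand-in. It only illustrates the idea of replacing one linear accumulation with several interleaved streams; the function names and the choice of four streams are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Traditional linear method: one accumulator, one long dependency chain. */
uint32_t checksum_linear(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Interleaved method: four independent accumulators, each taking every
   fourth byte, so the additions do not wait on one another. */
uint32_t checksum_interleaved(const uint8_t *buf, size_t len)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        s0 += buf[i];
        s1 += buf[i + 1];
        s2 += buf[i + 2];
        s3 += buf[i + 3];
    }
    /* Combine the streams, then handle any leftover tail bytes. */
    uint32_t sum = s0 + s1 + s2 + s3;
    for (; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}
```

For a plain sum the streams can be combined by simple addition because modular addition is associative and commutative. Position-dependent checksums need a more involved combining step.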
The Black-Scholes benchmark is one of the 13 benchmarks in the PARSEC suite. This benchmark performs option pricing with the Black-Scholes partial differential equation (PDE). The Black-Scholes equation is a differential equation that describes how, under a certain set of assumptions, the value of an option changes as the price of the underlying asset changes. Based on this formula, one can compute the...
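For reference, the PDE and the standard closed-form price of a European call can be written as follows. The notation is the conventional one and is an assumption here, since the teaser defines no symbols: $V$ is the option value, $S$ the underlying price, $\sigma$ the volatility, $r$ the risk-free rate, $K$ the strike, $T$ the expiry, and $N$ the standard normal CDF, with no dividends.

```latex
\begin{align}
  \frac{\partial V}{\partial t}
    + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}
    + r S \frac{\partial V}{\partial S} - r V &= 0 \\
  C(S, t) &= S\,N(d_1) - K e^{-r(T-t)}\,N(d_2) \\
  d_1 &= \frac{\ln(S/K) + \left(r + \sigma^2/2\right)(T-t)}{\sigma\sqrt{T-t}},
  \qquad d_2 = d_1 - \sigma\sqrt{T-t}
\end{align}
```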
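Monte Carlo uses statistical computing methods to solve complex scientific computing problems. It uses random numbers to simulate the uncertainty in a problem's inputs, and by repeatedly sampling the parameters it arrives at a definite result and solves problems that could not be solved by other means. The method originated in the late 1940s, when it was first proposed by nuclear physicists working on the Manhattan Project, and it is named after Monte Carlo, Monaco's largest casino town.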
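The repeated-sampling idea can be seen in a classic toy problem, estimating pi from random points in the unit square. This is a minimal sketch and not the article's workload; the generator is a simple 64-bit linear congruential generator chosen for reproducibility, and the names `next_uniform` and `estimate_pi` are hypothetical.

```c
#include <stdint.h>

/* Minimal 64-bit LCG; returns a double in [0, 1) built from the
   top 53 bits of the state. Illustrative only, not a
   production-quality generator. */
static double next_uniform(uint64_t *state)
{
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(*state >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Estimates pi as 4 times the fraction of random points in the unit
   square that land inside the quarter circle of radius 1. */
double estimate_pi(long samples, uint64_t seed)
{
    uint64_t state = seed;
    long inside = 0;
    for (long i = 0; i < samples; i++) {
        double x = next_uniform(&state);
        double y = next_uniform(&state);
        if (x * x + y * y < 1.0) {
            inside++;
        }
    }
    return 4.0 * (double)inside / (double)samples;
}
```

The statistical error shrinks roughly with the square root of the sample count, which is why Monte Carlo codes run very large numbers of independent samples and parallelize well.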
Our building block is the FD compute kernels that are typically used for RTM (reverse time migration) algorithms for seismic imaging. The computations performed by the ISO-3DFD (Isotropic 3-dimensional finite difference) stencils play a major role in accurate imaging of complex subsurface structures in oil and gas surveys and exploration. Here we leverage the ISO-3DFD discussed in  and  and...