The goal of the N-Body problem is to predict the motion of a set of n objects that interact with each other through some force, e.g. the gravitational force. N-Body simulations are used in particle simulations such as astrophysical and molecular dynamics simulations. There are a number of approaches for solving the N-Body problem, such as the Barnes-Hut algorithm, the Fast Multipole method, the...
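As a point of reference, the simplest way to evaluate the forces is direct all-pairs summation, which costs O(n²) per step and is the baseline that tree and multipole methods try to beat. The sketch below is a minimal illustration under stated assumptions, not the method of the article: the function name `accelerations`, the default `G = 1.0`, and the `softening` term are all hypothetical choices made for this example.

```python
import math

def accelerations(positions, masses, G=1.0, softening=0.0):
    """Direct O(n^2) gravitational accelerations for n bodies in 3D.

    positions: list of [x, y, z]; masses: list of floats.
    G and softening are illustrative assumptions, not values from the source.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            # Vector from body i to body j
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2] + softening * softening
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * d[k] * inv_r3
    return acc

# Two unit masses two length units apart attract each other symmetrically.
two_body = accelerations([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 1.0])
```

A nonzero `softening` is a common way to avoid a division by zero when two bodies come very close; whether it is appropriate depends on the physics being modelled.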
LAMMPS is an open-source software package that simulates classical molecular dynamics. Because it supports many energy models and simulation options, it has become a popular choice. It was first developed at Sandia National Laboratories to take advantage of large-scale parallel computation.
Programming for Multicore and Many-core Products including Intel® Xeon® processors and Intel® Xeon Phi™ X100 Product Family coprocessors
The programming models used every day for multicore processors are available for many-core coprocessors as well. Therefore, explaining how to program both Intel Xeon processors and Intel Xeon Phi coprocessors is best done by explaining the options for parallel programming. This paper provides the foundation for understanding how multicore processors and many-core coprocessors are...
The sample demonstrates how to implement an efficient median filter with the OpenCL™ standard. This implementation relies on auto-vectorization performed by the Intel® SDK for OpenCL Applications compiler.
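For readers unfamiliar with the operation itself, the following is a plain scalar sketch of what a median filter computes. It is not the sample's OpenCL kernel and does not show auto-vectorization; the 3x3 window, the choice to leave border pixels unchanged, and the name `median_filter_3x3` are assumptions made for illustration.

```python
def median_filter_3x3(image):
    """Apply a 3x3 median filter to a 2D list of numbers.

    Border pixels are copied unchanged (an assumption of this sketch).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    out = [row[:] for row in image]  # copy so the input is not modified
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Gather the nine neighbours and take the middle value
            window = sorted(
                image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            out[y][x] = window[4]
    return out

# A single bright outlier in a flat region is removed by the filter.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
```

Each output pixel depends only on its own input neighbourhood, which is the property that makes the filter a natural fit for data-parallel execution.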
Chapter 1 – Introduction
The Simple Optimizations sample demonstrates simple ways of measuring the performance of OpenCL™ kernels in an application. It describes the basics of profiling and important caveats, such as having a dedicated “warming” run. It also demonstrates several simple optimizations; some are rather CPU-specific (like mapping buffers), while others are more general (like using relaxed math). The...
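The idea of a dedicated warming run can be sketched independently of OpenCL: run the workload once or more without timing it, so that one-time costs (compilation, cache and memory setup) do not distort the measurement, then time several repeats. The helper below is a hypothetical illustration in plain Python; `time_kernel`, `warmup`, and `repeats` are names invented for this sketch and are not part of the sample or of any OpenCL API.

```python
import time

def time_kernel(fn, *args, warmup=1, repeats=5):
    """Time fn(*args) after untimed warm-up runs.

    Returns (best, mean) wall-clock seconds over the timed repeats.
    """
    for _ in range(warmup):
        fn(*args)  # warm-up: the result and the time are discarded
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings), sum(timings) / len(timings)

# Example workload standing in for a kernel launch.
def workload(n):
    return sum(i * i for i in range(n))

best, mean = time_kernel(workload, 10_000, warmup=2, repeats=5)
```

Reporting the best time alongside the mean is one common convention; with a real OpenCL application, the timed region would also need to wait for the kernel to finish before the clock is stopped.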
Demonstrates how to implement an efficient sorting routine with the OpenCL™ technology that operates on an arbitrary input array of integer values. The sample uses the properties of bitonic sequences and the principles of sorting networks, and enables efficient SIMD-style parallelism through OpenCL vector data types. The code is designed to work well on modern CPUs.
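The sorting-network structure behind a bitonic sort can be sketched in a few lines. This is a sequential Python illustration of the general technique, not the sample's OpenCL code, and it does not use vector data types; the name `bitonic_sort` and the restriction to power-of-two lengths are assumptions of this sketch.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two using a bitonic network.

    Returns a new ascending list; the input is left unchanged.
    """
    n = len(values)
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two in this sketch")
    a = list(values)
    k = 2
    while k <= n:            # size of the bitonic sequences being merged
        j = k // 2
        while j > 0:         # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    out_of_order = (
                        a[i] > a[partner] if ascending else a[i] < a[partner]
                    )
                    if out_of_order:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

result = bitonic_sort([5, 3, 8, 1, 9, 2, 7, 4])
```

Within one pass of the inner `for` loop, every compare-and-swap touches a disjoint pair of elements and the pairing does not depend on the data, which is why the comparisons of a pass can run in parallel on SIMD hardware.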