Learn how to write an MPI program in Python*, and take advantage of Intel® multicore architectures using OpenMP threads and Intel® AVX512 instructions.
In this article an OpenMP* based implementation of the Ant Colony Optimization algorithm was analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016 together with improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks were introduced to achieve efficient scaling across a four-socket Intel® Xeon® processor E7-8890 v4 processor-based system.
How to install and enable Offload Over Fabric, configure the hardware, and test the configuration.
This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
Many-Fermion Dynamics---nuclear, or MFDn, is a configuration interaction (CI) code for nuclear structure calculations.
Yet Another Stencil Kernel (YASK), is a framework to facilitate design exploration and tuning of HPC kernels including vector folding, cache blocking, memory layout, loop construction, temporal wave-front blocking, and others.YASK contains a specialized source-to-source translator to convert scalar C++ stencil code to SIMD-optimized code.
Cython* is a superset of Python* that additionally supports C functions and C types on variable and class attributes. Cython generates C extension modules, which can be used by the main Python program using the import statement.
Learn techniques for vectorizing code, adding thread-level parallelism, and enabling memory optimization.