HPCWire Videos

Coding the Future, Intel's Vision
James Reinders, Director and Parallel Programming Evangelist, talks about Intel’s vision of consistent, standards-based software development tools. He also discusses his new book that gives an introduction to programming for parallelism on Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors.

Vectorization with array notation
Vectorized software effectively utilizes SIMD instructions to operate on 2, 4 or 8 times as many variables as scalar code. This is especially important on the Intel® Xeon Phi™ coprocessor. An array notation coding style ensures that the compiler generates code utilizing SIMD operations. The array notation syntax is frequently easier to read and may require fewer lines of code.

Vectorization with pragmas
This session explains how to use pragmas and directives effectively to vectorize software in both Fortran and C++/C. Vectorized software utilizes the processor's SIMD instructions, which operate on 2, 4 or 8 elements at once and so make software more efficient and faster. Utilization of the SIMD operations is essential for good performance on the Intel® Xeon Phi™ coprocessor.
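As a rough illustration of the pragma approach described above, here is a minimal sketch that uses the standard OpenMP `#pragma omp simd` directive. This is an assumption for illustration: the session may instead use Intel-specific pragmas, and the function name `saxpy` is hypothetical. The pragma tells the compiler that the loop iterations are independent, so it is free to generate SIMD instructions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: computes y[i] = a * x[i] + y[i] for i in [0, n).
// The pragma asserts that iterations are independent, so the compiler may
// vectorize the loop. If OpenMP SIMD support is not enabled at compile time,
// the pragma is ignored and the loop runs as ordinary scalar code.
// The caller must ensure that x and y do not overlap.
void saxpy(float a, const float* x, float* y, std::size_t n) {
    assert(n == 0 || (x != nullptr && y != nullptr));
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

With GCC or Clang, the pragma is honored when compiling with `-fopenmp-simd` (or `-fopenmp`); whether the loop is actually vectorized still depends on the compiler and target.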

Data alignment for effective vectorization in Fortran and C++/C
Along with adopting either an array syntax programming style or selecting the proper pragmas/directives for vectorization, data alignment is one more step developers can take to increase the efficiency and performance of their software. Learn how easy it is to align data and increase performance. This is particularly important for the Intel® Xeon Phi™ coprocessor.
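To make the idea of data alignment concrete, the following sketch uses only standard C++17 facilities (`alignas` and `std::aligned_alloc`) rather than any compiler-specific attribute the session may cover. The 64-byte boundary is an assumption chosen to match a 512-bit vector width, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumed alignment: 64 bytes, the size of one 512-bit vector.
constexpr std::size_t kAlignment = 64;

// Returns true if the address p is a multiple of alignment.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates room for n floats on a 64-byte boundary. std::aligned_alloc
// expects the size to be a multiple of the alignment, so the byte count is
// rounded up (and is at least one full alignment unit). Release with std::free.
float* alloc_aligned_floats(std::size_t n) {
    std::size_t bytes = n * sizeof(float);
    std::size_t rounded = (bytes + kAlignment - 1) / kAlignment * kAlignment;
    if (rounded == 0) {
        rounded = kAlignment;
    }
    return static_cast<float*>(std::aligned_alloc(kAlignment, rounded));
}

// Sums n floats. The aligned clause promises the compiler that data starts
// on a 64-byte boundary; the assert checks that promise in debug builds.
float sum_aligned(const float* data, std::size_t n) {
    assert(is_aligned(data, kAlignment));
    float total = 0.0f;
    #pragma omp simd aligned(data : 64) reduction(+ : total)
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

For stack or static arrays, a declaration such as `alignas(64) float buf[16];` requests the same boundary without a heap allocation.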

Faster math performance with Intel® Math Kernel Library
The Intel® Math Kernel Library (Intel® MKL) provides many optimized math routines, including BLAS routines, LAPACK, FFT and sparse solvers. The session shows how easy it is to link Intel MKL into your software and demonstrates potential performance gains on Intel processors and the Intel® Xeon Phi™ coprocessor.

Automatic offload with Intel® Math Kernel Library
The fastest way to performance is to use highly optimized libraries such as Intel® Math Kernel Library (Intel® MKL). The automated offload capabilities of Intel MKL make it even easier to take advantage of the Intel® Xeon Phi™ coprocessor. Select Intel MKL functions automatically detect when it would be beneficial to offload operations onto the Intel Xeon Phi coprocessor.

Threading with OpenMP*
This session introduces you to OpenMP pragmas/directives. When using the OpenMP threading library, developers use simple pragmas/directives to identify parallel regions and tasks in order to develop threaded software. OpenMP is one of the most popular threading models in HPC programs.
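A minimal sketch of an OpenMP parallel region may help show what "simple pragmas" means in practice. The function name `dot` is hypothetical and the example is not taken from the session. A single directive asks the runtime to divide the loop iterations among threads, and the `reduction` clause gives each thread a private partial sum that is combined when the loop ends.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: dot product of two equally sized vectors.
// "parallel for" splits the iterations across a team of threads;
// "reduction(+ : sum)" gives every thread its own partial sum and adds
// the partial sums together at the end of the loop.
// Without OpenMP enabled, the pragma is ignored and the loop runs serially.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    double sum = 0.0;
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

With GCC or Clang, threading is enabled by compiling with `-fopenmp`. Because floating-point addition is not associative, the threaded result can differ from the serial one in the last bits for general inputs.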

Simplified threading with Intel® Cilk™ Plus
Intel® Cilk™ Plus threading is a highly efficient threading model. Its simplicity, using three simple constructs, belies its power and flexibility. This session will provide developers with an introduction to get started and explore this threading model for parallelism.

Threading with Intel® Threading Building Blocks
This session explains the usage of the popular Intel® Threading Building Blocks (Intel® TBB) on the Intel® Xeon Phi™ coprocessor. Intel TBB is a template-based C++ programming library that contains many common scalable constructs. It’s one of the most popular threading models used by C++ programmers.

Performance analysis with Intel® VTune™ Amplifier XE
Understanding what your software is doing on the processor is the first step towards tuning your software to run faster. In this session we introduce VTune™ Amplifier XE and show how to get started by collecting hotspots in your software, as well as where to get more information for deeper analysis.

Distributed Computing with Intel® MPI Library
Learn how to use Intel MPI on the Intel® Xeon Phi™ coprocessor. The Intel Xeon Phi coprocessor runs a full Linux* OS. Learn how to specify usage with the Intel MPI Library.

Analyzing and Balancing MPI Applications
Learn how to collect and analyze data to tune your MPI application running on Intel® Xeon Phi™ coprocessors. This session demonstrates how Intel® Trace Analyzer can be used to tune MPI applications and improve performance.

For more complete information about compiler optimizations, see our Optimization Notice.