In this article an OpenMP* based implementation of the Ant Colony Optimization algorithm was analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016 together with improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks were introduced to achieve efficient scaling across a four-socket Intel® Xeon® processor E7-8890 v4 processor-based system.
This article will describe performance considerations for CPU inference using Intel® Optimization for TensorFlow*
This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
The computer learning code Caffe* has been optimized for Intel® Xeon Phi™ processors. This article provides detailed instructions on how to compile and run this Caffe* optimized for Intel® architecture to obtain the best performance on Intel Xeon Phi processors.
In continued efforts to optimize Deep Learning workloads on Intel® architecture, our engineers explore various paths leading to the maximum performance.
This document is designed to help users get started writing code and running MPI applications using the Intel® MPI Library on a development platform that includes the Intel® Xeon Phi™ processor.