In this article, an OpenMP*-based implementation of the Ant Colony Optimization algorithm is analyzed for bottlenecks with Intel® VTune™ Amplifier XE 2016, and improvements using hybrid MPI-OpenMP and Intel® Threading Building Blocks are introduced to achieve efficient scaling across a four-socket Intel® Xeon® processor E7-8890 v4-based system.
This article describes performance considerations for CPU inference using Intel® Optimization for TensorFlow*.
This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
This article explores what happens when Intel solutions support functional and logic programming languages that are regularly used for Artificial Intelligence (AI). It proposes recompiling a Prolog interpreter with the Intel® C++ Compiler and libraries in order to evaluate their contribution to logic-based AI.
Optimize the performance of inference services on CPUs and save computing resources using the iQIYI deep learning cloud platform.
In continued efforts to optimize Deep Learning workloads on Intel® architecture, our engineers explore various paths to maximum performance.
This paper introduces the Artificial Intelligence (AI) community to Intel® optimization for TensorFlow* on Intel® Xeon® and Intel® Xeon Phi™ processor-based CPU platforms.
Boosting Deep Learning Training & Inference Performance on Intel® Xeon® and Intel® Xeon Phi™ Processors
In this work, we present how, without a single line of code change in the framework, we can further boost the performance for deep learning training by up to 2X and inference by up to 2.7X on top of the current software optimizations available from open source TensorFlow* and Caffe* on Intel® Xeon® processors.