Intel® Optimizations for TensorFlow
UPDATE: Intel® Optimizations for TensorFlow* 1.10 is now available.
Intel® Optimizations for TensorFlow 1.9 is now available. These binary packages are provided as a convenience for those who do not wish to compile TensorFlow from source but still want the CPU performance obtained when TensorFlow is built with support for Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN).
All binaries were built against the TensorFlow v1.9.0 tag. The following patches were applied to the sources before generating the binaries:
- A patch to upgrade the curl library to 7.60. See https://github.com/tensorflow/tensorflow/pull/20181.
- A patch to include the license file for Intel® MKL-DNN. See https://github.com/tensorflow/tensorflow/pull/20936.
- A patch to improve Intel MKL-DNN performance on older Intel processors code-named Sandy Bridge and Ivy Bridge. See https://github.com/tensorflow/tensorflow/pull/20576.
All binaries were generated in an Ubuntu 16.04 container with gcc 5.4.0 and glibc 2.26.1, using the following compiler flags (shown below as passed to bazel):
--config=mkl --copt=-march=sandybridge --copt=-mtune=ivybridge --copt=-O3 --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0
These flags instruct bazel and the gcc compiler to build with support for Intel® MKL-DNN and to include AVX and other instructions that are only available on platforms formerly code-named "Sandy Bridge" and newer. Also, the whl packages will only work on systems with gcc >= 5.4 and glibc >= 2.26. You can determine what version of gcc you have with the following command:
$ gcc --version
And your glibc version can be determined with this command:
$ ld --version
If your system has older versions of these components, please use the containers.
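The gcc check above, together with a check for AVX support, can also be scripted. The following is a minimal sketch rather than anything shipped with the packages: it assumes a Linux system where /proc/cpuinfo is readable, and the helper names (parse_version, meets_minimum, tool_version, has_avx, preflight) are hypothetical. The minimum gcc version is the one quoted above.

```python
import re
import subprocess

# Minimum gcc version quoted in the text above.
MIN_GCC = (5, 4)


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, minimum):
    """Compare version tuples, e.g. (5, 4, 0) >= (5, 4) is True."""
    return tuple(found) >= tuple(minimum)


def tool_version(command):
    """Run a command such as ["gcc", "--version"] and parse its first line."""
    output = subprocess.check_output(command).decode("utf-8", "replace")
    return parse_version(output.splitlines()[0])


def has_avx(cpuinfo_text):
    """Look for the avx flag on the first 'flags' line of /proc/cpuinfo content."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return "avx" in line.split(":", 1)[1].split()
    return False


def preflight():
    """Print a short report; requires gcc on PATH and a readable /proc/cpuinfo."""
    gcc = tool_version(["gcc", "--version"])
    print("gcc", gcc, "ok" if meets_minimum(gcc, MIN_GCC) else "too old")
    with open("/proc/cpuinfo") as handle:
        print("AVX available:", has_avx(handle.read()))
```

Calling preflight() prints one line per check. The version parser simply takes the first dotted number on the first output line, which matches the usual gcc banner but is an assumption about the output format.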
Detailed Release Notes are below. The binaries are available as both Python whl packages and as Docker containers. For download links and detailed installation instructions, please see the Intel® Optimization for TensorFlow* Installation Guide.
These release notes only apply to the changes that are introduced when TensorFlow is built with support for Intel® MKL-DNN. For features and fixes that were introduced in TensorFlow 1.9, please see the TensorFlow 1.9 release notes.
New Functionality and Usability Improvements
- Changed inter_op_parallelism_threads defaults when built with MKL to avoid thread oversubscription. See TensorFlow Optimizing for CPU for more information.
- Added default OpenMP* settings that are expected to give reasonable performance when using MKL kernels.
- Added feature to query CPUID to determine the number of hyperthreads per physical core on Intel 64-bit architectures.
- Upgraded Intel MKL-DNN to version 0.14.
- Changed KMP_BLOCKTIME environment variable to 0.
- Increased default inter_op_parallelism_threads parameter to be less conservative.
- Enhanced Conv2d forward performance by re-using MKL-DNN primitives.
- Removed use of the deprecated StringPiece class.
- Updated the tensorflow/compiler/aot tests for 64-byte alignment.
- Updated Tensor.Slice_Basic for 64-byte alignment.
- Updated ScopedAllocatorConcatOpTest.Reshape for 64-byte alignment.
- Fixed registration issues for MKL_ML ops.
- Fixed BFCAllocator::Extend alignment issues.
- Fixed a build issue related to MklConcat when using older versions of gcc.
- Fixed alignment crashes in AVX512 builds.
- Fixed the convrnn unit test failure.
- Fixed the mkl_layout_pass_test failure.
- Fixed the util_cuda_kernel_helper_test_gpu failure when building with MKL-DNN enabled.
- Fixed a unit test failure for Intel MKL-DNN in which a memory allocation check failed.
- Fixed a failure in //tensorflow/python/profiler:model_analyzer_test.
- Fixed the error when looking for libhdfs.so on Mac; Mac OS uses libhdfs.dylib.
- Fixed a bug in mkl_input_conversion op when reorder is not needed.
- Set EIGEN_MAX_ALIGN_BYTES=64 to prevent crashes during the execution of the unit tests when they are compiled with AVX512 support.
- Fixed the single_machine_test.cc unit test, which failed because of special nodes inserted when Intel MKL-DNN is enabled.
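Several items in the list above concern threading defaults (inter_op_parallelism_threads, the OpenMP* settings, and KMP_BLOCKTIME). As a rough sketch of how a user might set these explicitly instead of relying on the defaults, the example below assumes the TensorFlow 1.x tf.ConfigProto API. The helper names and the thread counts are illustrative assumptions, not the values the MKL build selects; only KMP_BLOCKTIME=0 is taken from the notes above.

```python
import os


def mkl_env_settings(physical_cores):
    """Environment variables for the Intel OpenMP runtime (illustrative values)."""
    return {
        # From the release notes: KMP_BLOCKTIME is now set to 0.
        "KMP_BLOCKTIME": "0",
        # Assumption for this sketch: one OpenMP thread per physical core.
        "OMP_NUM_THREADS": str(physical_cores),
    }


def apply_env(settings, environ=os.environ):
    """Set each variable unless it is already present in the environment."""
    for name, value in settings.items():
        environ.setdefault(name, value)
    return environ


def make_session_config(inter_op_threads, intra_op_threads):
    """Build a TensorFlow 1.x session config with explicit thread pool sizes."""
    import tensorflow as tf  # imported lazily; requires TensorFlow 1.x

    return tf.ConfigProto(
        inter_op_parallelism_threads=inter_op_threads,
        intra_op_parallelism_threads=intra_op_threads,
    )
```

A typical use would be to call apply_env(mkl_env_settings(physical_cores=8)) before TensorFlow is imported, and then pass make_session_config(2, 8) as the config argument of tf.Session. The values 8 and 2 are placeholders; suitable values depend on the machine and the model.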
The following issues were fixed after the branching of TensorFlow 1.9:
- Undefined reference to 'dladdr' at link time if built with gcc6.3.
- Concat-related ops fail with mixed formats/layouts when using Intel MKL-DNN.
- Incorrect MklConv2DWithBiasBackpropBias registration when compiling with Intel MKL-DNN; should only be registered when compiling with Intel MKL-ML.
*Other names and brands may be claimed as the property of others.