In interpreted languages, it just takes longer to get stuff done - I earlier gave the example where the Python source code a = b + c would result in a BINARY_ADD byte code which takes 78 machine instructions to do the add, but it's a single native ADD instruction if run in compiled language like C or C++. How can we speed this up? Or as the performance expert would say, how do I decrease...
This article will describe performance considerations for CPU inference using Intel® Optimization for TensorFlow*
本文将介绍使用面向 TensorFlow 的英特尔® 优化* 进行 CPU 推理的性能注意事项