One of the Intel® Modern Code Developer Challenge winners, Daniel Falguera, describes many of the optimizations he implemented and why some didn't work.
Don't miss Intel's talk "How to optimize your code without being a parallel computing 'ninja'", which will be given during the Week on Massively Parallel Programming (Semana sobre Programação Massivamente Paralela) in Petrópolis, RJ, at the Laboratório Nacional de Computação Científica (LNCC). Date: 02/02/2016 - 11:30 a.m. Venue: LNCC - Av. Getúlio Vargas, 333 - Quitandinha - Petrópolis/RJ
I can. And if you read this post, you will be able to write one, too. (It might be a cool party trick or a sucker bet to make a little cash.)
This paper demonstrates a special version of Caffe* — a deep learning framework originally developed by the Berkeley Vision and Learning Center (BVLC) — that is optimized for Intel® architecture.
See how the new Intel® AVX-512CD and Intel® AVX-512F subsets of Intel® Advanced Vector Extensions 512 (available in the Intel® Xeon Phi processor and in future Intel Xeon processors) let the compiler automatically generate vector code with no changes to the source code.
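As a rough illustration of what "no changes to the code" can mean, here is a minimal sketch of two ordinary scalar loops that a vectorizing compiler may turn into AVX-512 code. The function names `saxpy` and `histogram` are hypothetical and do not come from the article. The first loop has independent iterations, the usual case for the foundation (F) instructions. The second may update the same bin twice when an index repeats, which is the kind of potential conflict the conflict detection (CD) instructions are designed to handle.

```c
#include <stddef.h>
#include <stdint.h>

/* Independent iterations: each y[i] depends only on x[i] and y[i].
 * `restrict` tells the compiler that x and y do not overlap. */
void saxpy(float a, const float *restrict x, float *restrict y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Indirect update: when idx[] contains repeated values, several
 * iterations write to the same bin, so the loop cannot be vectorized
 * naively. The caller must ensure every idx[i] is a valid bin index. */
void histogram(const uint32_t *restrict idx, size_t n, uint32_t *restrict bins)
{
    for (size_t i = 0; i < n; ++i) {
        bins[idx[i]] += 1;
    }
}
```

With GCC, for example, the instruction subsets can be enabled with `-O3 -mavx512f -mavx512cd`. Whether a particular loop is actually vectorized depends on the compiler and its version, so it is worth checking the compiler's vectorization report rather than assuming it.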
Cython* is a superset of Python* that additionally supports calling C functions and declaring C types on variables and class attributes. Cython generates C extension modules, which the main Python program can load with the import statement.
Read an in-depth analysis of code modernization performance, conducted by optimizing the original CPU code and re-running the tests on the latest GPU/CPU hardware.