A couple of years ago, the Bit Manipulation Instruction Set 1 (BMI1) introduced the BLSR instruction, which resets the lowest set bit. (The corresponding intrinsics _blsr_u32 and _blsr_u64 wrap this instruction in C/C++ function call syntax.) But what are your options when you want to clear not just one bit, but a given number of bits n? This blog post presents several variations on this theme, including a performant implementation.
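As a point of reference, here is a minimal portable sketch of the simplest variation: clearing the n lowest set bits one at a time. It is not taken from the blog post. The helper names clear_lowest_set_bit and clear_lowest_set_bits are hypothetical, and the expression x & (x - 1) stands in for BLSR so that the code compiles without BMI1 support.

```c
#include <stdint.h>

/* Clear the lowest set bit of x. This computes the same value as BLSR
   (x & (x - 1)); for x == 0 the result is 0. */
static uint32_t clear_lowest_set_bit(uint32_t x) {
    return x & (x - 1u);
}

/* Hypothetical helper: clear the n lowest set bits of x, one per iteration.
   Stops early once no set bits remain. */
static uint32_t clear_lowest_set_bits(uint32_t x, unsigned n) {
    while (n > 0 && x != 0) {
        x = clear_lowest_set_bit(x);
        --n;
    }
    return x;
}
```

For example, clear_lowest_set_bits(0xB6u, 2) clears bits 1 and 2 of 1011 0110 and yields 0xB0. The loop costs one iteration per cleared bit, which is the baseline a faster variant would need to beat.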
In Part 8 we integrate the GUI with the back end. We examine the implications of mixing managed code with enclaves and how to mitigate the risk of undermining the security gained from Intel® SGX.
This article discusses common techniques for fine-tuning the performance of automatically vectorized loops in applications for Intel® Xeon Phi™ coprocessors. These techniques include strength reduction, regularizing the vectorization pattern, data alignment and aligned-data hints, and pointer disambiguation.
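To make two of these techniques concrete, here is a small sketch that is not taken from the article. It assumes a hypothetical kernel, scale_by_reciprocal, that divides every element of an array by a constant. Strength reduction replaces the per-element division with a single hoisted reciprocal and a multiplication, and the C99 restrict qualifier tells the compiler that the two arrays do not overlap (pointer disambiguation). Alignment hints are compiler-specific and are left out here.

```c
#include <stddef.h>

/* Hypothetical kernel: out[i] = in[i] / divisor for i in [0, n).
   - restrict: promises the compiler that out and in do not alias,
     so the loop can be vectorized without runtime overlap checks.
   - inv: strength reduction; one division outside the loop replaces
     n divisions inside it. The result may differ from true division
     in the last bit when 1/divisor is not exactly representable. */
void scale_by_reciprocal(float *restrict out, const float *restrict in,
                         size_t n, float divisor) {
    const float inv = 1.0f / divisor;
    for (size_t i = 0; i < n; ++i) {
        out[i] = in[i] * inv;
    }
}
```

Whether the compiler actually vectorizes this loop depends on the compiler and its flags; a vectorization report is the usual way to check.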
MILC software is a set of codes written by the MIMD Lattice Computation collaboration and used to study quantum chromodynamics, the theory of the strong interactions of subatomic physics. This article provides instructions for accessing, building, and running the “ks_imp_rhmc” application on Intel® Xeon® processors and Intel® Xeon Phi™ processors.
The latest version of MXNet includes built-in support for the Intel® Math Kernel Library (Intel® MKL) 2018. The latest version of Intel MKL includes optimizations for Intel® Advanced Vector Extensions 2 (Intel® AVX2) and AVX-512 instructions, which are supported on Intel® Xeon® processors and Intel® Xeon Phi™ processors.
Spark is the leading framework for distributed ML, so adding deep learning to this hugely popular framework is important: it allows Spark developers to perform a wide range of data analysis tasks (including data wrangling, interactive queries, and stream processing) within a single framework. Three important features offered by BigDL are rich deep learning support, high single-node Xeon performance, and efficient scale-out leveraging the Spark architecture.
This article completes an analysis of a problem erroneously reported on the Intel® Developer Zone forum: “Vectorization failed because of unsigned integer?” It provides a more detailed examination showing that the unsigned integer type does not affect compiler vectorization, and it describes what methodology to use when a modern C/C++ compiler fails to auto-vectorize for-loops.
The following is a quick guide to getting a PhysX* Destructible Mesh (DM) set up and working in an Unreal Engine* 4 (UE4*) project. This guide is primarily based on personal trial and error; other methods may exist that work better for your project. See the official documentation for tutorials on fracturing and troubleshooting if you would like to go into more depth with Destructible Mesh capabilities.
Realistic cloth movement can bring a great deal of visual immersion to a game. Using PhysX* Clothing* is one way to achieve this without the need for hand animation. Incorporating these simulations into Unreal Engine* 4 is easy, but because they are taxing on the CPU, it’s good to understand their performance characteristics and how to optimize them.