Welcome to another year of The Parallel Universe. 2020 promises to be interesting. Some readers may know that I used to dread the trend toward heterogeneous computing. Then I came to accept it as inevitable. Now, I’m embracing it. Sure, heterogeneity is both a blessing and a curse. The blessing is better performance and efficiency. The curse is increased complexity. My hope is that someday machines will just take care of it for me (see “Why More Software Development Needs to Go to the Machines”), but until then, practical steps are being taken to minimize this complexity, starting with oneAPI. oneAPI is an open specification that describes a single software abstraction across diverse compute architectures.
The Intel implementation of oneAPI was recently announced by Raja Koduri (senior vice president, chief architect, and general manager of Intel Architecture, Graphics, and Software) at the Intel® HPC Developer Conference. Our feature article, “Heterogeneous Programming Using oneAPI,” gives an overview of this unified, standards-based approach to heterogeneous computing. “Accelerating Compression on Intel® FPGAs” shows how Data Parallel C++ makes FPGAs more accessible. Continuing this theme of heterogeneity, “Is Your Game GPU-Bound?” shows you how to answer this question using analysis tools like Intel® Graphics Performance Analyzers.
In the last issue, I briefly covered composable threading in the Julia* programming language. Jameson Nash and Jeff Bezanson from Julia Computing, Inc., and Kiran Pamnany from Caltech were kind enough to provide a more detailed look at the New Threading Capabilities in Julia v1.3 for this issue. They walk through several code examples illustrating task parallelism using Julia.
We close this issue with three articles on data analytics. The first, “Fast Gradient Boosting Tree Inference for Intel® Xeon® Processors,” shows how to use the XGBoost* library to improve the performance of model predictions. If you recall, the feature article in our last issue covered performance improvements for XGBoost training. The second, “K-means Acceleration with 2nd Generation Intel® Xeon® Scalable Processors,” shows how to take advantage of optimizations in Intel® Distribution for Python* and the Intel® Data Analytics Acceleration Library to do k-means clustering. Finally, in “Measuring Graph Analytics Performance,” I discuss the right ways, and the wrong ways, to do graph analytics benchmarking. However, the graph analytics landscape is large and varied, so please let me know if you disagree with my assertions.
Expect to see more articles on oneAPI in future issues. And, as always, don’t forget to check out Tech.Decoded for more information on Intel’s solutions for code modernization, visual computing, data center and cloud computing, data science, and systems and IoT development.