Back in 1993, the institute where I was doing my postdoctoral research got access to a Cray C90* supercomputer. Competition for time on this system was so fierce that we were told, in no uncertain terms, that if our programs didn’t take advantage of the architecture, they should run elsewhere. The C90 was a vector processor, so we had to vectorize our code. Those of us who took the time to read the compiler reports and make the necessary code modifications saw striking performance gains. Though vectorization would eventually take a backseat to parallelization in the multicore era, this optimization technique remains important. In Vectorization Becomes Important—Again, Robert H. Dodds Jr. (professor emeritus at the University of Illinois) shows how to vectorize a real application using modern programming tools.
We continue our celebration of OpenMP*’s 20th birthday with a guest editorial from Bronis R. de Supinski, chief technology officer of Livermore Computing and the current chair of the OpenMP Language Committee. Bronis gives his take on the conception and evolution of OpenMP as well as its future direction in OpenMP Is Turning 20! Though 20 years old, OpenMP continues to evolve with modern architectures.
I used to be a Fortran zealot, then a Perl* zealot, and I’m now a Python* zealot. Recently, I had occasion to experiment with a new productivity language called Julia*. I recoded some of my time-consuming data-wrangling applications from Python to Julia, keeping the translation line-for-line as much as possible. The performance gains were startling, especially because these were not the numerically intensive applications where Julia is known to shine. They were string manipulation applications that prepare data sets for text mining. I’m not ready to forsake Python and its vast ecosystem just yet, but Julia definitely has my attention. Take a look at Julia*: A High-Level Language for Supercomputing for an overview of the language and its features.
This issue’s feature article, Tuning Autonomous Driving Using Intel® System Studio, illustrates how the tools in Intel System Studio give developers of embedded systems and connected devices an integrated environment to build and debug their applications and to tune performance and power usage. Continuing the theme of tuning edge applications, Building Fast Data Compression Code for Cloud and Edge Applications shows how to use Intel® Integrated Performance Primitives (Intel® IPP) to speed data compression.
As I mentioned in the last issue of The Parallel Universe, R* is not my favorite language. It is useful, however, and Accelerating Linear Regression in R* with Intel® Data Analytics Acceleration Library (Intel® DAAL) shows how data analytics applications in R can take advantage of this library. Finally, MySQL* Optimization with Intel® C++ Compiler rounds out this issue with a demonstration of how Interprocedural Optimization significantly improves the performance of another application that matters to data scientists: the MySQL database.
Future issues of The Parallel Universe will contain articles on a wide range of topics, including persistent memory, IoT development, the Intel® Advanced Vector Extensions (Intel® AVX) instruction set, and much more. Stay tuned!