The Parallel Universe Magazine


Issue 33

Advance AI on Apache Spark* with BigDL: Features, Use Cases, and the Future

BigDL has evolved into a vibrant open source project since Intel introduced it in December 2016. This issue details its implementation, describes some real-world use cases, and offers a glimpse of the new end-to-end analytics-plus-AI pipelines (the Analytics Zoo platform) being built on top of Apache Spark* and BigDL.



Parallel Universe Magazine - Issue 33, July 2018


  • Letter from the Editor: What's the Big Deal about BigDL? by Henry A. Gabb
  • Advancing Artificial Intelligence on Apache Spark* with BigDL by Jason Dai and Radhika Rangarajan
    Features, use cases, and the future.
  • Why WebAssembly Is the Future of Computing on the Web by Rich Winterton, Deepti Aggarwal, Tuyet-Trang (Snow) Lam Piel, Brittney Coons, and Nathan Johns
    The history and new direction of processing in the browser.
  • Code Modernization in Action: Threading, Memory, and Vectorization Optimizations by Dmitry Prohorov, Cedric Andreolli, and Philippe Thierry
    Using the latest Intel® Software Development Tools to make more efficient use of hardware.
  • In-Persistent Memory Computing with Java* by Eric Kaczmarek and Preetika Tyagi
    The key to adaptability in modern application programming.

  • Faster Gradient-Boosting Decision Trees by Ying Hu, Oleg Kremnyov, and Ivan Kuzmin
    How to lift machine learning performance using Intel® Data Analytics Acceleration Library (Intel® DAAL).
  • Hiding Communication Latency Using MPI-3 Non-Blocking Collectives by Amarpal Singh Kapoor, Rama Kishan Malladi, Nitya Hariharan, and Srinivas Sridharan
    Improving HPC performance by overlapping communication and computation.

Parallel Universe Magazine - Issue 32, March 2018


  • Letter from the Editor: Computer Vision Coming Soon to a Browser Near You by Henry A. Gabb
  • Computer Vision for the Masses by Sajjad Taheri, Alexandru Nicolau, Alexander Veidenbaum, Ningxin Hu, and Mohammad Reza Haghighat
    Bringing computer vision to the Open Web Platform*.
  • Up Your Game by Giselle Gomez
    How to optimize your game development―no matter what your role―using Intel® Graphics Performance Analyzers.
  • Harp-DAAL for High-Performance Big Data Computing by Judy Qiu
    The key to simultaneously boosting productivity and performance.
  • Understanding the Instruction Pipeline by Alex Shinsel
    The key to adaptability in modern application programming.
  • Parallel CFD with the HiFUN* Solver on the Intel® Xeon® Scalable Processor by Rama Kishan Malladi, S.V. Vinutha, and Austin Cherian
    Maximizing HPC platforms for fast numerical simulations.
  • Improving VASP* Materials Simulation Performance by Fedor Vasilev, Dmitry Sivkov, and Jeongnim Kim
    Using the latest Intel® Software Development Tools to make more efficient use of hardware.

Parallel Universe Magazine - Issue 31, January 2018


  • Letter from the Editor: Happy New Year, Happy Parallel Computing, by Henry A. Gabb
    Henry A. Gabb is a longtime high-performance and parallel computing practitioner who has published numerous articles on parallel programming.
  • FPGA Programming with the OpenCL™ Platform, by James Reinders and Tom Hill
    Knowing how to program an FPGA is a skill you need―and here’s how to start.
  • Accelerating the Eigen Math Library for Automated Driving Workloads, by Steena Monteiro and Gaurav Bansal
    Meeting the need for speed with Intel® Math Kernel Library.
  • Speeding Algebra Computations with the Intel® Math Kernel Library Vectorized Compact Matrix Functions, by Kirana Bergstrom, Eugene Chereshnev, and Timothy B. Costa
    Maximizing the performance benefits of the compact data layout.
  • Boosting Java* Performance in Big Data Applications, by Kumar Shiv and Rahul Kandu
    How new enhancements enable faster and better numerical computing.
  • Gaining Performance Insights Using the Intel® Advisor Python* API, by Kevin O’Leary and Egor Kazachkov
    Getting good data to make code tuning decisions.
  • Welcome to the Intel® AI Academy, by Niven Singh
    AI education for all.

Parallel Universe Magazine - Issue 30, October 2017


  • Letter from the Editor: Meet Intel® Parallel Studio XE 2018, by Henry A. Gabb
  • Driving Code Performance with Intel® Advisor’s Flow Graph Analyzer, by Vasanth Tovinkere, Pablo Reble, Farshad Akhbari, and Palanivel Guruvareddiar
    Optimizing performance for an autonomous driving application.
  • Welcome to the Adult World, OpenMP*, by Barbara Chapman
    After 20 years, it’s more relevant than ever.
  • Enabling FPGAs for Software Developers, by Bernhard Friebe, and James Reinders
    Boosting efficiency and performance for automotive, networking, and cloud computing.
  • Modernize Your Code for Performance, Portability, and Scalability, by Jackson Marusarz
    What’s new in Intel® Parallel Studio XE.
  • Dealing with Outliers, by Oleg Kremnyov, Mikhail Averbukh, and Ivan Kuzmin
    How to find fraudulent transactions in a real-world dataset.
  • Tuning for Success with the Latest SIMD Extensions and Intel® Advanced Vector Extensions 512, by Xinmin Tian, Hideki Saito, Sergey Kozhukhov, and Nikolay Panchenko
    Best practices for taking advantage of the latest architectural features.
  • Effectively Using Your Whole Cluster, by Rama Kishan Malladi
    Optimizing SPECFEM3D_GLOBE* performance on Intel® architecture.
  • Is Your Cluster Healthy?, by Brock A. Taylor
    Must-have cluster diagnostics in Intel® Cluster Checker.
  • Optimizing HPC Clusters, by Michael Hebenstreit
    Enabling on-demand BIOS configuration changes in HPC clusters.

Parallel Universe Magazine - Issue 29, July 2017


  • Letter from the Editor: Old and New, by Henry A. Gabb
  • Tuning Autonomous Driving Using Intel® System Studio, by Lavanya Chockalingam
    The Intel® GO™ Automotive SDK offers automotive solution developers an integrated development environment.
  • OpenMP* Is Turning 20!, by Bronis R. de Supinski
    Making parallel programming accessible to C/C++ and Fortran programmers.
  • Julia*: A High-Level Language for Supercomputing, by Ranjan Anantharaman, Viral Shah, and Alan Edelman
    The Julia Project continues to break new boundaries in scientific computing.
  • Vectorization Becomes Important—Again, by Robert H. Dodds Jr.
    Open source code WARP3D exemplifies renewed interest in vectorization.
  • Building Fast Data Compression Code for Cloud and Edge Applications, by Chao Yu and Sergey Khlystov
    How to optimize your compression with Intel® Integrated Performance Primitives (Intel® IPP).
  • MySQL* Optimization with Intel® C++ Compiler, by Huixiang Tao, Ying Hu, and Ming Gao
    Leverage MySQL* to deliver peak service.
  • Accelerating Linear Regression in R* with Intel® DAAL, by Steena Monteiro and Shaojuan Zhu
    Make better predictions with this highly optimized open source package.

Parallel Universe Magazine - Issue 28, April 2017


  • Letter from the Editor: Parallel Languages, Language Extensions, and Application Frameworks, by Henry A. Gabb
  • Parallel STL: Boosting Performance of C++ STL Code, by Vladimir Polin and Mikhail Dvorskiy
    C++ and the evolution toward natively parallel languages.
  • Happy 20th Birthday, OpenMP*, by Rob Farber
    Making parallel programming accessible to C/C++ and Fortran programmers—and providing a software path to exascale computation.
  • Solving Real-World Machine Learning Problems with Intel® Data Analytics Acceleration Library, by Oleg Kremnyov, Ivan Kuzmin, and Gennady Fedorov
    Models are put to the test in Kaggle* competitions.
  • HPC with R*: The Basics, by Drew Schmidt
    Satisfying the need for speed in data analytics.
  • BigDL: Optimized Deep Learning on Apache Spark*, by Jason Dai and Radhika Rangarajan
    Making deep learning more accessible.

Parallel Universe Magazine - Issue 27, January 2017


  • Letter from the Editor: The Changing HPC Landscape Still Looks the Same, by Henry A. Gabb
  • The Present and Future of the OpenMP* API Specification, by Michael Klemm, Alejandro Duran, Ravi Narayanaswamy, Xinmin Tian, and Terry Wilmarth
    How the gold standard parallel programming language has improved with each new version.
  • Reducing Packing Overhead in Matrix-Matrix Multiplication, by Kazushige Goto, Murat Efe Guney, and Sarah Knepper
    Improve performance on multicore and many-core Intel® architectures, particularly for deep neural networks.
  • Identify Scalability Problems in Parallel Applications, by Vladimir Tsymbal
    How to improve scalability for Intel® Xeon® and Intel® Xeon Phi™ Processors using new Intel® VTune™ Amplifier memory analysis.
  • Vectorization Opportunities for Improved Performance with Intel® AVX-512, by Martyn Corden
    Examples of how Intel® Compilers can vectorize and speed up loops.
  • Intel® Advisor Roofline Analysis, by Kevin O’Leary, Ilyas Gazizov, Alexandra Shinsel, Zakhar Matveev, and Dmitry Petunin
    A new way to visualize performance optimization trade-offs.
  • Intel-Powered Deep Learning Frameworks, by Pubudu Silva
    Your path to deeper insights.

Parallel Universe Magazine - Issue 26, October 2016


  • Letter from the Editor: What Will Machines Learn from You?, by Mike Lee
  • Modernize Your Code for Intel® Xeon Phi™ Processors, by Yolanda Chen and Udit Patidar
    Explore new Intel® Parallel Studio XE 2017 capabilities.
  • Unleash the Power of Big Data Analytics and Machine Learning, by Vadim Pirogov, Ivan Kuzmin, and Sarah Knepper
    Solve big data era application challenges with Intel® Performance Libraries.
  • Overcome Python* Performance Barriers for Machine Learning, by Vasily Litvinov, Viktoriya Fedotova, Anton Malakhov, Aleksei Fedotov, Ruslan Israfilov, and Christopher Hogan
    Accelerate and optimize Python* machine learning applications.
  • Profiling Java* and Python* Code using Intel® VTune™ Amplifier, by Sukruv Hv
    Get more CPU capability for Java*- and Python*-based applications.
  • Lightning-Fast R* Machine Learning Algorithms, by Zhang Zhang
    Get results with the Intel® Data Analytics Acceleration Library and the latest Intel® Xeon Phi™ processor.
  • A Performance Library for Data Analytics and Machine Learning, by Shaojuan Zhu
    See how the Intel® Data Analytics Acceleration Library impacts C++ coding for handwritten digit recognition.
  • MeritData Speeds Up its Tempo* Big Data Platform Using Intel® High-Performance Libraries, by Jin Qiang, Ying Hu, and Ning Wang
    Case study finds performance improvements and potential for big data algorithms and visualization.

Parallel Universe Magazine - Issue 25, June 2016


  • Letter from the Editor: Democratization of HPC, by James Reinders
    James Reinders, an expert on parallel programming, is coauthor of the new Intel® Xeon Phi™ Processor High Performance Programming—Knights Landing Edition.
  • Supercharging Python* with Intel and Anaconda* for Open Data Science, by Travis Oliphant
    The technologies that promise to tackle Big Data challenges.
  • Getting Your Python* Code to Run Faster Using Intel® VTune™ Amplifier XE, by Kevin O’Leary
    Providing line-level profiling information with very low overhead.
  • Parallel Programming with Intel® MPI Library in Python*, by Artem Ryabov and Alexey Malhanov
    Guidelines and tools for improving performance.
  • The Other Side of the Chip, by Robert Ioffe
    Using Intel® Processor Graphics for Compute with OpenCL™.
  • A Runtime-Generated Fast Fourier Transform for Intel® Processor Graphics, by Dan Petre, Adam T. Lake, and Allen Hux
    Optimizing FFT without increasing complexity.
  • Indirect Calls and Virtual Functions Calls: Vectorization with Intel® C/C++ 17.0 Compilers, by Hideki Saito, Serge Preis, Sergey Kozhukhov, Xinmin Tian, Clark Nelson, Jennifer Yu, Sergey Maslov, and Udit Patidar
    The newest Intel® C++ Compiler introduces support for indirectly calling a SIMD-enabled function in a vectorized fashion.
  • Optimizing an Illegal Image Filter System, by Yueqiang Lu, Ying Hu, and Huaqiang Wang
    Tencent doubles the speed of its illegal image filter system using a SIMD instruction set and Intel® Integrated Performance Primitives.

Parallel Universe Magazine - Special Issue, June 2016


  • Letter from the Editor: From Hatching to Soaring: Intel® TBB, by James Reinders
    James Reinders, an expert on parallel programming, is coauthor of the new Intel® Xeon Phi™ Processor High Performance Programming – Knights Landing Edition (June 2016), and coeditor of the recent High Performance Parallel Programming Pearls Volumes One and Two (2014 and 2015).
  • The Genesis and Evolution of Intel® Threading Building Blocks, by Arch D. Robison
    A decade after the introduction of Intel Threading Building Blocks, the original architect shares his perspective.
  • A Tale of Two High-Performance Libraries, by Vipin Kumar E.K.
    How Intel® Math Kernel Library and Intel® Threading Building Blocks work together to improve performance.
  • Heterogeneous Programming with Intel® Threading Building Blocks, by Alexei Katranov, Oleg Loginov, and Michael Voss
    With new features, Intel® Threading Building Blocks can coordinate the execution of computations across multiple devices.
  • Preparing for a Many-Core Future, by Kevin O’Leary, Ben Langmead, John O’Neill, and Alexey Kukanov
    Johns Hopkins University adds multicore parallelism to increase performance of its Bowtie 2* application.
  • Leading and Following the C++ Standard, by Alexei Katranov
    Intel® Threading Building Blocks adheres tightly to the C++ standard where it can—and paves the way for supporting parallelism best.
  • Intel® Threading Building Blocks: Toward the Future, by Alexey Kukanov
    The architect of Intel® Threading Building Blocks shares thoughts on the opportunities ahead.

Parallel Universe Magazine - Issue 24, March 2016


  • Letter from the Editor, James Reinders
    Time-Saving Tips as Spring Begins in the Northern Hemisphere
  • Improve Productivity and Boost C++ Performance
    The new Intel® SIMD Data Layout Template library optimizes C++ code and helps improve SIMD efficiency.
  • Intel® C++ Compiler Standard Edition for Embedded Systems with Bi-Endian Technology
    Intel® C++ Compiler Standard Edition for Embedded Systems with Bi-Endian Technology helps developers looking to overcome platform lock-in.
  • OpenMP* API Version 4.5: A Standard Evolves
    OpenMP* version 4.5 is the next step in the standard’s evolution, introducing new concepts for parallel programming as well as additional features for offload programming.
  • Intel® MPI Library: Supporting the Hadoop* Ecosystem
    With data analytics breaking into the HPC world, the question of using MPI and big data frameworks in the same ecosystem is getting more attention.
  • Finding Your Memory Access Performance Bottlenecks
    The new Intel® VTune™ Amplifier XE Memory Access analysis feature shows how some tough memory problems can be resolved.
  • Optimizing Image Identification with Intel® Integrated Performance Primitives
    Intel worked closely with engineers at China’s largest and most-used Internet service portal to help them achieve a 100 percent performance improvement on the Intel® architecture-based platform.
  • Develop Smarter Using the Latest IoT and Embedded Technology
    A closer look at tools for coding, analysis, and debugging with all Intel® microcontrollers, Internet of Things (IoT) devices, and embedded platforms.
  • Tuning Hybrid Applications with Intel® Cluster Tools
    This article provides a step-by-step workflow for hybrid application analysis and tuning.
  • Vectorize Your Code Using Intel® Advisor XE 2016
    Vectorization Advisor boasts new features that can assist with vectorization on the next generation of Intel® Xeon Phi™ processors.

Parallel Universe Magazine - Issue 23, November 2015


  • Letter from the Editor, by James Reinders
    Computers “Think” More Like Humans, but They Still Need Us
  • Which Tool Do I Use? A Roadmap to Increasing Your Application’s Performance
    By using the correct tool at each phase of your performance tuning, you can greatly increase performance at lower cost.
  • Modernizing Code for Tomorrow’s HPC Problem-Solving
    Tips on code modernization, or increasing parallel programming, that have proven valuable for dedicated HPC software developers, domain specialists, and data scientists alike.
  • Get a Helping Hand from the Vectorization Advisor
    With Vectorization Advisor recommendations, the Hartree Centre was able to get an 18 percent speedup in their code.
  • Optimizing Image Processing
    China’s largest online direct sales company handles several billion product images every day. Using Intel® software development tools, it sped up its image processing 17x, handling 300,000 images in 162 seconds.
  • Boosting Speech Recognition Performance
    Qihoo360 Technology Co., Ltd., a Chinese Internet security company, collaborated with Intel to optimize its Euler* platform, which supports machine learning-related computation models for real businesses.
  • How Fortran Developers Can Boost Productivity with Submodules
    Submodules are now supported in Intel® Fortran Compiler Version 16.0.


Get The Latest Issue

Intel’s quarterly magazine helps you take your software development into the future with the latest tools, tips, and training to expand your expertise.
