Intel® HPC Developer Conference 2017

Artificial Intelligence

Enable the Future of Artificial Intelligence
(22 min)

Artificial intelligence (AI) is unlocking tremendous economic value across various sectors. Data scientists can use several open source frameworks and basic hardware resources during the initial investigative phases. This talk details progress to date and future plans to democratize AI.

Andres Rodriguez, Intel

Presentation (PDF)


Assessment of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) to Speed Up Crash Simulations with Altair RADIOSS*

Inside the Altair HyperWorks* computer-aided engineering (CAE) suite, RADIOSS is the tool primarily used for crash and safety simulations. This presentation explains how the code has been optimized for the new Intel® Xeon® Scalable processors and Intel® Xeon Phi™ processors, leveraging the Intel® AVX-512 instruction set to improve solver speed.

Eric Lequiniou, Altair*

Presentation (PDF)


Emerging AI Technologies on Intel® Platforms
(23 min)

Federated learning approaches have recently made AI on edge devices both desirable and necessary for the evolution of ubiquitous artificial intelligence. This session presents emerging AI technologies, their feasibility on edge devices, and their role in federated learning.

Sherine Abdelhak and Manuj Sabharwal, Intel

Presentation (PDF)


Low-Precision Neural Networks with FPGAs
(25 min)

Deep learning and AI are revolutionizing how we work and interact with technology. As model sizes grow, the ability to adapt models for fast, power-efficient execution will limit performance scalability. We explore the technical and practical feasibility of low-precision deep neural networks.

Eriko Nurvitadhi, Kevin Nealis, and Philip Colangelo, Intel

Presentation (PDF)
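Low-precision networks of the kind this talk explores typically replace 32-bit floating-point weights with narrow integers. As a minimal, framework-agnostic sketch of the general idea (not code from the presentation), symmetric uniform quantization to n bits can be written as:

```python
import numpy as np

def quantize_symmetric(w, bits):
    """Quantize a float array to signed integers with `bits` bits,
    using a symmetric (zero-centered) uniform scheme."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    scale = np.max(np.abs(w)) / qmax    # map the largest weight to qmax
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float array from the integer codes."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.0, 0.25, 0.9], dtype=np.float32)
q, scale = quantize_symmetric(w, bits=8)
w_hat = dequantize(q, scale)            # close to w, error at most one step
```

On an FPGA the integer codes and the shared scale replace the floating-point weights, so each multiply-accumulate needs only narrow integer hardware.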


Scalable Deep Learning with BigDL on Cray Urika-XC*
(21 min)

Cray Urika-XC* brings a suite of analytics and deep learning software to Cray XC supercomputer users. This talk presents an overview of Urika-XC and its productivity and scaling benefits. We demonstrate these benefits with a full deep learning workflow that uses Apache Spark* and BigDL to predict rainfall.

Kristyn Maschhoff, Ananda Kommaraju, Alex Heye, and Michael Ringenburg, Cray Inc.

Presentation (PDF)


Deep Learning at 15 PetaFLOPS: Supervised and Semi-Supervised Classification for Scientific Data

We present the first 15 petaFLOPS deep learning system for solving supervised and semi-supervised scientific pattern classification problems, optimized for Intel® Xeon Phi™ processors. We use a hybrid of synchronous and asynchronous training to scale to approximately 9600 nodes of Cori on convolutional neural networks (CNN) and autoencoder networks.

Narayanan Sundaram, Intel
Thorsten Kurth, Lawrence Berkeley National Laboratory

Presentation (PDF)


Predicting Disease with Deep Learning on TensorFlow from Genetic Variant Data
(22 min)

We present a novel deep learning pipeline for using genetic variant data to predict patient risk for several clinical phenotypes. We compare our approach to standard methods in the literature, and discuss performance and optimization of our approach with TensorFlow on Intel® architecture.

Kyle Ambert and Sandeep Gupta, Intel
Ali Torkamani, Scripps Translational Science Institute

Presentation (PDF)


Fast Deep Learning with Caffe* and TensorFlow on Multicore CPUs
(23 min)

Learn how to accelerate deep learning inference and training on manycore and multicore CPUs using stand-alone frameworks like Caffe* or TensorFlow. Eliminate the need for a GPU by achieving superior inference performance on an 8-core Intel® Xeon® processor, illustrated with real-world automotive and neuroscience examples.

Victor Jakubiuk, OnSpecta, Inc.

Presentation (PDF)


Intel® Nervana™ Graph: A Universal Deep Learning Compiler
(23 min)

The Intel® Nervana™ Graph project provides an efficient, hardware-independent intermediate representation (IR) for deep learning frameworks. It offers connectors to TensorFlow and to neon™, Intel's reference framework, with backends for compiling and executing the IR on CPUs, GPUs, and future deep learning accelerators.

Jason Knight, Intel

Presentation (PDF)


Analysis of Effective Connectivity Using Python*
(21 min)

Scalp electroencephalogram (EEG) recordings yield functional connectivity patterns and information for the classification and analysis of interictal epileptiform discharges. This session discusses the multivariate autoregressive (MVAR) model and the use of information flow between identified neuronal systems as a parameter for machine learning and deep learning classification algorithms.

Panuwat Janwattanapong, Florida International University

Presentation (PDF)
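For reference, the multivariate autoregressive model this session builds on expresses each multichannel EEG sample as a linear combination of the previous p samples plus noise (standard textbook form, not taken from the presentation):

```latex
% p-th order MVAR model for a k-channel signal x(t):
x(t) = \sum_{r=1}^{p} A_r \, x(t - r) + e(t)
```

Here each A_r is a k-by-k coefficient matrix and e(t) is white noise; frequency-domain connectivity measures of information flow between channels are derived from the fitted A_r matrices.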


Accelerated Characterization of Neural Circuits of the Brain
(29 min)

Modern parallel computing techniques and optimizations for Intel® Xeon Phi™ processors have allowed dramatic acceleration of computations to characterize neural circuits of the brain. We show how HPC and AI can impact clinical care.

Simon Warfield and Benoit Scherrer, Boston Children's Hospital and Harvard Medical School

Presentation (ZIP)


Experiences of Scaling TensorFlow on up to 512 Nodes of the Cori Supercomputer
(24 min)

Although TensorFlow supports multicore CPUs, evaluation of the default CPU backend reveals suboptimal performance. In this talk, we describe a collaborative effort between Intel and Google engineers to optimize TensorFlow for modern x86 systems, resulting in speedups of up to 85 times on common neural network models over the default CPU backend. We demonstrate the capability of Intel® Optimization for TensorFlow* on the latest Intel® Xeon® and Intel® Xeon Phi™ processors.

Amrita Mathuriya, Intel

Presentation (PDF)


Achieving Deep Learning Training in Less Than 40 Minutes on ImageNet-1K
(25 min)

Training models to high accuracy on large image datasets can typically take weeks or months. We present scale-out results that suggest this can be achieved much faster, such as training the ResNet-50 architecture on ImageNet-1K using either Intel® Xeon® or Intel® Xeon Phi™ processor nodes.

Vikram Saletore, Intel
Valeriu Codreanu and Damian Podareanu, SURFsara B.V.

Presentation (ZIP)


Machine Learning for Boosted Jet Classification in High-Energy Physics
(20 min)

Machine learning has inspired novel data analysis techniques in experiments at the Large Hadron Collider of the European Organization for Nuclear Research (CERN). In this work, we address the problem of boosted jet classification in high-energy physics using artificial neural networks.

José Cupertino Ruiz Vargas, Raphael Cóbe, and Silvio Stanzani, Center for Scientific Computing (NCC/UNESP)

Presentation (PDF)
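As a generic illustration of the kind of classifier involved (a toy NumPy network trained on synthetic two-class features, not the authors' model or data), a small feed-forward network with one hidden layer can be trained by plain gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data standing in for jet features;
# class 1 is shifted by +2 in both dimensions.
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y = np.concatenate([np.zeros(200), np.ones(200)])

# One hidden tanh layer, sigmoid output, cross-entropy loss.
W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(500):                      # full-batch gradient descent, lr = 1
    h = np.tanh(X @ W1 + b1)              # hidden activations
    p = sigmoid(h @ W2 + b2).ravel()      # P(class 1)
    g = (p - y)[:, None] / len(y)         # d(loss)/d(output logit)
    gh = (g @ W2.T) * (1 - h ** 2)        # backprop through tanh
    W2 -= h.T @ g;  b2 -= g.sum(0)
    W1 -= X.T @ gh; b1 -= gh.sum(0)

acc = np.mean((p > 0.5) == y)             # training accuracy
```

Real jet taggers differ mainly in the inputs (jet mass, substructure observables, or calorimeter images) and in depth, not in this basic training loop.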


Performance Evaluation of FPGA-Based Bit-Scaled CNN Architecture
(25 min)

In this presentation, we describe the design of fully scalable fixed-point DSP MACs (multiply-accumulate units) with rounding and saturation. We use these MACs as the neurons within a convolutional neural network (CNN) in both sequential and parallel structures, and evaluate the corresponding performance.

Charles Chang Choo, San Jose State University, FPGA Lab
Byung-Joo Kim, Mando Innovations, Silicon Valley

Presentation (PDF)
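The behavior of a fixed-point MAC with rounding and saturation can be sketched in software (an illustrative bit-accurate model with assumed word widths, not the authors' RTL design):

```python
def sat_round_mac(acc, a, b, frac_bits=8, word_bits=16):
    """One fixed-point multiply-accumulate step with rounding and
    saturation, as in a DSP MAC unit.  Operands and the accumulator are
    signed `word_bits`-bit integers with `frac_bits` fractional bits."""
    lo = -(1 << (word_bits - 1))        # e.g. -32768 for 16 bits
    hi = (1 << (word_bits - 1)) - 1     # e.g.  32767
    prod = a * b                        # double-width product
    # Round to nearest by adding half an LSB before the shift.
    prod = (prod + (1 << (frac_bits - 1))) >> frac_bits
    # Saturate instead of wrapping around on overflow.
    return max(lo, min(hi, acc + prod))

# 0.5 * 0.5 in Q8 fixed point: 128 * 128 -> 64, i.e. 0.25
acc = sat_round_mac(0, 128, 128)
```

Saturation matters for CNN neurons because wrap-around overflow flips the sign of a large activation, while clamping to the representable extreme degrades accuracy gracefully.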


High-Performance Machine Learning for Weather Prediction Applications
(21 min)

This session discusses a flow graph extension to the Intel® Threading Building Blocks interface used for coordinating layers of heterogeneity, retaining optimization opportunities, and composing with existing models. We also discuss expressing complex synchronization and communication patterns, and balancing the load between CPUs, GPUs, and FPGAs.

Hatem Ltaief and Kadir Akbudak, KAUST

Presentation (PDF)


Best Practices for On-Demand HPC in Enterprises
(25 min)

Recent advances in on-demand HPC in cloud and deep learning are fueling adoption of HPC in mainstream enterprises beyond traditional scientific domains. This session covers best practices for building on-demand HPC solutions for enterprise workloads.

Geeta Chauhan, SVSG

Presentation (PDF)