Intel® HPC Developer Conference 2017
Artificial intelligence (AI) is unlocking tremendous economic value across various sectors. Data scientists can use several open source frameworks and basic hardware resources during the earliest investigative phases. This talk details what has been done so far, and what is planned, to democratize AI.
Andres Rodriguez, Intel
Inside the Altair HyperWorks* computer-aided engineering (CAE) suite, RADIOSS is the tool primarily used for crash and safety simulations. This presentation explains how the code has been optimized for new Intel® Xeon® Scalable processors and Intel® Xeon Phi™ processors, leveraging the Intel® AVX-512 instruction set to improve solver speed.
Eric Lequiniou, Altair*
Get a brief introduction to ANSYS Fluent software parallelization and the current status of its HPC practices, followed by the optimization of Fluent on Intel® processors, with investigations and results.
Pavan Mutnuri and Rongguang Jia, ANSYS Inc
Walk through some of the real-life software engineering challenges encountered throughout the development stages of refactoring an aging HPC codebase. Learn about some of the obstacles and the strategies used to resolve them.
Jon Povich, Convergent Science
In this talk, we review different approaches to designing linear solvers, the workhorse in many simulators, that are robust, performant, and scalable on the new manycore architectures, such as Intel® Xeon Phi™ processors, while taking into account the specifics of the hardware.
Tom Jonsthovel, Schlumberger
Federated learning approaches have recently made AI on edge devices desirable and necessary for the evolution of ubiquitous artificial intelligence. This session presents emerging AI technologies, their feasibility on the edge, and their role in federated learning.
Sherine Abdelhak and Manuj Sabharwal, Intel
TensorFlow* is a leading machine learning and deep learning framework that enables data scientists to address problems on a variety of devices ranging from multicore CPUs to custom ASICs (TPUs). We share how optimizing TensorFlow has resulted in a speedup of up to 85 times on common neural network models.
Vivek Rane, Intel
Deep learning and AI are revolutionizing how we work and interact with technology. As model sizes grow, the ability to adapt for fast power efficiency will limit performance scalability. We explore the technical and practical feasibility of low-precision deep neural networks.
Eriko Nurvitadhi, Kevin Nealis, and Philip Colangelo, Intel
Cray Urika-XC* brings a suite of analytics and deep learning software to Urika-XC users. This talk presents an overview of Urika-XC and its productivity and scaling benefits. We demonstrate these benefits with a full deep learning workflow that uses Apache Spark* and BigDL to predict rainfall.
Kristyn Maschhoff, Ananda Kommaraju, Alex Heye, and Michael Ringenburg, Cray Inc.
We present the first 15-petaFLOPS deep learning system for solving supervised and semi-supervised scientific pattern classification problems, optimized for Intel® Xeon Phi™ processors. We use a hybrid of synchronous and asynchronous training to scale to approximately 9600 nodes of Cori on convolutional neural networks (CNNs) and autoencoder networks.
Narayanan Sundaram, Intel
Thorsten Kurth, Lawrence Berkeley National Laboratory
We present a novel deep learning pipeline for using genetic variant data to predict patient risk for several clinical phenotypes. We compare our approach to standard methods in the literature, and discuss performance and optimization of our approach with TensorFlow on Intel® architecture.
Kyle Ambert and Sandeep Gupta, Intel
Ali Torkamani, Scripps Translational Science Institute
Learn how to accelerate deep learning inference and training on manycore and multicore CPUs using stand-alone frameworks like Caffe or TensorFlow. Eliminate the need for the GPU by achieving superior inference performance on an 8-core Intel® Xeon® processor, demonstrated with real-world automotive and neuroscience examples.
Victor Jakubiuk, OnSpecta, Inc.
The Intel® Nervana™ Graph project provides an efficient, hardware-independent intermediate representation (IR) for deep learning frameworks. It offers connectors to TensorFlow and to neon™, Intel’s reference framework, along with backends for compiling and executing the IR on CPUs, GPUs, and future deep learning accelerators.
Jason Knight, Intel
Scalp electroencephalography (EEG) yields functional connectivity patterns and information for classifying and analyzing interictal epileptiform discharges. This session discusses the multivariate autoregressive (MVAR) model and the use of information flow between identified neuronal systems as a parameter for machine learning and deep learning classification algorithms.
Panuwat Janwattanapong, Florida International University
We discuss anomaly detection and neural network methods.
Ole Mengshoel, Aniruddha Basak, and Tong Yu, Carnegie Mellon University
Modern parallel computing techniques and optimizations for Intel® Xeon Phi™ processors have allowed dramatic acceleration of computations to characterize neural circuits of the brain. We show how HPC and AI can impact clinical care.
Simon Warfield and Benoit Scherrer, Boston Children's Hospital and Harvard Medical School
As a scientist at Los Alamos National Laboratory during the 1980s, the speaker created a mapping with near-linear scalability that has since run on most leadership-class supercomputers, using tens of thousands of nodes and delivering PF/s training performance.
Robert Farber, TechEnablement
See how Descartes Labs created a cloud-based supercomputing platform for the application of machine intelligence to massive data sets. Capitalizing on the confluence of advances in AI and HPC in the cloud, they created an enterprise data refinery for satellite imagery on a global scale.
Mike Warren, Descartes Labs
Although TensorFlow supports multicore CPUs, evaluation of the default CPU backend reveals suboptimal performance. In this talk, we describe a collaborative effort between Intel and Google engineers to optimize TensorFlow for modern x86 systems, resulting in speedups of up to 85 times on common neural network models over the default CPU backend. We demonstrate the capability of Intel® Optimization for TensorFlow* on the latest Intel® Xeon® and Intel® Xeon Phi™ processors.
Amrita Mathuriya, Intel
Training models with high accuracy on large image datasets can typically take weeks or months. We present scale-out results that suggest one can achieve this much faster, such as training the ResNet-50 architecture on ImageNet-1K using either Intel® Xeon® or Intel® Xeon Phi™ processor nodes.
Vikram Saletore, Intel
Valeriu Codreanu and Damian Podareanu, SURFsara B.V.
Deep learning frameworks provide good performance on a single workstation, but scaling across multiple nodes is less well understood and still evolving. This introductory lecture explains the key steps to enable deep learning capabilities in your existing HPC system without additional hardware.
Jeff Adams, Intel
Machine learning has inspired novel data analysis techniques in experiments such as the Large Hadron Collider of the European Organization for Nuclear Research (CERN). In this work, we handle the problem of boosted jet classification in high-energy physics by using artificial neural networks.
José Cupertino Ruiz Vargas, Raphael Cóbe, and Silvio Stanzani, Center for Scientific Computing (NCC/UNESP)
In this presentation, we describe the design of fully scalable fixed-point DSP multiply-accumulate units (MACs) with rounding and saturation. We use the MACs as the neurons of a convolutional neural network (CNN) in sequential and parallel structures, and evaluate the corresponding performance.
Charles Chang Choo, San Jose State University, FPGA Lab
Byung-Joo Kim, Mando Innovations, Silicon Valley
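As a rough illustration of the fixed-point MAC with rounding and saturation mentioned in the abstract above, the following Python sketch models one neuron's accumulation. It is not the speakers' design: the Q1.15 operand format, the 16-bit saturated output, the round-half-up rule, and the names `to_fixed`, `saturate`, and `mac_neuron` are all assumptions made for this example.

```python
# Illustrative fixed-point multiply-accumulate (MAC) with rounding and saturation.
# Assumed for this sketch only: Q1.15 operands and a 16-bit saturated output.
# The word lengths used in the talk are not stated in the abstract.

FRAC_BITS = 15                  # assumed number of fractional bits (Q1.15)
OUT_MIN = -(1 << 15)            # most negative 16-bit two's-complement value
OUT_MAX = (1 << 15) - 1         # most positive 16-bit two's-complement value


def saturate(v: int) -> int:
    """Clamp a wide integer to the 16-bit output range."""
    return max(OUT_MIN, min(OUT_MAX, v))


def to_fixed(x: float) -> int:
    """Quantize a float to Q1.15, saturating at the representable range."""
    return saturate(round(x * (1 << FRAC_BITS)))


def mac_neuron(weights, inputs, bias=0):
    """Sum the products at full precision, then round and saturate once."""
    acc = bias << FRAC_BITS     # align the Q1.15 bias with the Q2.30 products
    for w, x in zip(weights, inputs):
        acc += w * x            # Python integers do not overflow
    acc += 1 << (FRAC_BITS - 1)  # add half an output LSB (round half up)
    return saturate(acc >> FRAC_BITS)


if __name__ == "__main__":
    w = [to_fixed(0.5), to_fixed(-0.25)]
    x = [to_fixed(0.25), to_fixed(0.5)]
    print(mac_neuron(w, x))     # 0.5*0.25 + (-0.25)*0.5 in Q1.15
```

Rounding and saturating once, after the full-precision accumulation, is one common choice; a hardware design might instead saturate the accumulator at every step, which can change results when intermediate sums overflow.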
This session discusses the flow graph interface of Intel® Threading Building Blocks and its extensions, used as a coordination layer for heterogeneity that retains optimization opportunities and composes with existing models. We also discuss expressing complex synchronization and communication patterns, and balancing the load between CPUs, GPUs, and FPGAs.
Hatem Ltaief and Kadir Akbudak, KAUST
In this session, see how three Intel® technologies can be used in machine learning and deep learning.
Stephen Blair-Chappell, Bayncore
Recent advances in on-demand HPC in the cloud and in deep learning are fueling adoption of HPC in mainstream enterprises beyond traditional scientific domains. This session covers best practices for building on-demand HPC solutions for enterprise workloads.
Geeta Chauhan, SVSG