Nervana has joined Intel
Nervana is excited to share a series of short videos and accompanying exercises on building deep learning models with neon™, our deep learning framework. We start with a basic introduction to deep learning concepts, provide an overview of the neon framework, and discuss key neon concepts such as loading data and defining branching architectures. This will be a living series, so check back for updates and new videos.
You can also find more resources, including pre-trained models, Kaggle challenge scripts, videos from our meetups, and more here.
This video introduces the basic deep learning concepts necessary to both understand the neon codebase and build your own deep learning models. We discuss how deep learning differs from traditional machine learning, and cover basic concepts such as supervised learning, backpropagation, stochastic gradient descent, activation functions, and the basic linear unit.
https://www.youtube.com/watch?v=oEGGr2K_v_4
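To make these concepts concrete, here is a minimal sketch in plain Python (not neon code) of a single linear unit trained with stochastic gradient descent on a supervised toy problem. The function name, learning rate, and data are assumptions chosen for illustration.

```python
import random


def train_linear_unit(samples, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x + b with stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)          # "stochastic": visit samples in random order
        for x, y in data:
            pred = w * x + b       # forward pass through the linear unit
            err = pred - y         # d(loss)/d(pred) for loss = 0.5 * err ** 2
            w -= lr * err * x      # gradient of the loss with respect to w
            b -= lr * err          # gradient of the loss with respect to b
    return w, b


# Toy supervised data generated from y = 2x + 1 (made up for illustration).
samples = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]
w, b = train_linear_unit(samples)
print(round(w, 3), round(b, 3))
```

With noise-free data like this, the learned `w` and `b` should approach the slope and intercept used to generate the samples.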
For sequence data such as speech or text, recurrent neural networks (RNNs) are often used to capture the short- and long-term temporal dependencies in the data. Training RNNs is challenging because of the vanishing gradient problem. We introduce the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, which are designed to combat the vanishing gradient problem.
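The vanishing gradient problem can be illustrated with a small sketch, assuming a scalar RNN of the form h_t = tanh(w * h_{t-1} + x_t). This is plain Python rather than neon code, and the weight and input values are arbitrary. By the chain rule, the gradient of the final state with respect to the initial state is a product of per-step factors, each smaller than 1 in magnitude here, so it shrinks as the sequence gets longer.

```python
import math


def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_T / d h_0| for a scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        grad *= w * (1.0 - h * h)  # chain rule factor d h_t / d h_{t-1}
    return abs(grad)


inputs = [0.5] * 50
short_grad = gradient_through_time(0.9, inputs[:5])   # 5 time steps
long_grad = gradient_through_time(0.9, inputs)        # 50 time steps
print(short_grad, long_grad)
```

The gradient over 50 steps is many orders of magnitude smaller than over 5 steps, which is why early inputs have little influence on the weight updates of a plain RNN.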
For images and other data where ordering in the spatial dimensions has meaning, convolutional neural networks (CNNs) have proven effective. In this video, we discuss 1D, 2D, and 3D convolutional networks, and review recent CNN architectures that have enabled deeper and more powerful models (VGG, ResNet, etc.).
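As a rough illustration of the 1D case, the sketch below slides a small kernel over a signal with no padding. It is plain Python, not neon's convolution layer, and the function name and example values are assumptions.

```python
def conv1d(signal, kernel, stride=1):
    """Valid (no padding) 1D cross-correlation, the operation used in conv layers."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


# A difference kernel responds where the signal steps up (+1) or down (-1).
edges = conv1d([0, 0, 1, 1, 1, 0, 0], [-1, 1])
print(edges)  # [0, 1, 0, 0, -1, 0]
```

The same weights are reused at every position, which is what lets convolutional layers exploit spatial ordering with far fewer parameters than a fully connected layer.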
The neon deep learning framework provides an easy Python-based approach to getting started with deep learning. Here we introduce the basic modules within neon, and show how to construct models and use our command line arguments to customize training runs. We recommend viewing this video before trying our Jupyter notebooks. The MNIST Example and Fine-tuning VGG notebooks below are useful companions.
https://www.youtube.com/watch?v=-c1fq0hFU-0
In this video, we discuss two key concepts within neon: loading data into neon, and defining complex branching architectures. neon provides four different ways to load your data for training, depending on your data size and complexity. Several notebooks guide you through writing a custom dataset object, custom activation functions and layers, custom callbacks, and defining a complex branching model.
Some of our notebooks require GPUs because of memory and speed constraints. Our Nervana Cloud provides an easy interface to launch training jobs on our GPU servers. Trained models can also be deployed on a server to receive incoming inference requests via a REST API. This video demonstrates how to launch jobs, inspect progress, and deploy a trained model for inference.
One of our popular cloud features is interactive mode, where users can launch a Jupyter notebook server running on our GPUs and access the notebook through their web browser to interactively step through code for debugging or exploration.
https://www.youtube.com/watch?v=Zjh4ek5a69M
The above videos are accompanied by several Jupyter notebooks, found at https://github.com/NervanaSystems/neon_course, which provide guided exercises on key concepts in neon and common operations.
The Jupyter notebooks in this repository include:
A comprehensive walk-through of how to use neon to build a simple model that recognizes handwritten digits. Recommended as an introduction to the neon framework.
A popular application of deep learning is to load a pre-trained model and fine-tune on a new dataset that may have a different number of categories. This example walks through how to load a VGG model that has been pre-trained on ImageNet, a large corpus of natural images belonging to 1000 categories, and re-train the final few layers on the CIFAR-10 dataset, which has only 10 categories.
neon provides many built-in methods for loading data from images, videos, audio, text, and more. In the rare cases where you may have to implement a custom dataset object, this notebook guides users through building one for a modified version of the Street View House Numbers (SVHN) dataset. Users will not only write a custom dataset, but also design a network that, given an image, draws a bounding box around the digit sequence.
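The general shape of a dataset object can be sketched as an iterable that yields (input, target) minibatches. The class below is a hypothetical plain-Python illustration, not neon's actual dataset interface; the class name, attribute names, and the handling of the final short batch are all assumptions.

```python
class MinibatchIterator:
    """Hypothetical dataset object that yields (inputs, targets) minibatches."""

    def __init__(self, inputs, targets, batch_size):
        if len(inputs) != len(targets):
            raise ValueError("inputs and targets must have the same length")
        self.inputs = inputs
        self.targets = targets
        self.batch_size = batch_size
        self.ndata = len(inputs)

    @property
    def nbatches(self):
        # Ceiling division: the last batch may be smaller than batch_size.
        return -(-self.ndata // self.batch_size)

    def __iter__(self):
        for start in range(0, self.ndata, self.batch_size):
            stop = start + self.batch_size
            yield self.inputs[start:stop], self.targets[start:stop]


dataset = MinibatchIterator(["a", "b", "c", "d", "e"], [0, 1, 0, 1, 1], batch_size=2)
batches = list(dataset)
print(dataset.nbatches, batches[-1])
```

Because `__iter__` restarts from the beginning each time, the same object can be iterated once per epoch.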
This notebook walks developers through implementing custom activation functions and layers within neon. We implement the Affine layer, and demonstrate the difference in speed between a Python-based computation and our own heavily optimized kernels.
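For orientation, here is a minimal pure-Python sketch of what an activation function and an affine (fully connected) computation involve. It does not use neon's base classes or kernels; the class name, method names, and the leaky rectifier itself are chosen for illustration only.

```python
class LeakyRelu:
    """Hypothetical activation: a forward call plus its derivative for backprop."""

    def __init__(self, slope=0.01):
        self.slope = slope

    def __call__(self, x):
        return [v if v > 0 else self.slope * v for v in x]

    def bprop(self, x):
        # Derivative of the activation with respect to each input.
        return [1.0 if v > 0 else self.slope for v in x]


def affine(weights, bias, x, activation):
    """Compute activation(W x + b), with W given as a list of rows."""
    pre = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return activation(pre)


act = LeakyRelu(0.1)
out = affine([[1.0, -2.0], [0.5, 0.5]], [0.0, 1.0], [3.0, 4.0], act)
print(out)
```

A loop-based version like this is easy to read but slow, which is the kind of Python-based computation the notebook compares against optimized kernels.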
When a simple sequential list of layers does not suffice for your model, this notebook shows how to build complex branching models within neon.
In neon, models are constructed as python lists, which makes it easy to use for-loops to define complex models that have repeated patterns, such as deep residual networks. This notebook is an end-to-end walkthrough of building a deep residual network, training on the CIFAR-10 dataset, and then applying the model to predict categories on novel images.
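The idea of building a layer list with a for-loop can be sketched as follows. Plain tuples stand in for layer objects here, so the layer names, the block structure, and the `residual_stack` helper are hypothetical placeholders rather than neon classes.

```python
def residual_stack(num_blocks, channels, num_classes=10):
    """Build a layer list with a repeated pattern using a for-loop."""
    layers = [("conv", channels)]
    for _ in range(num_blocks):
        # Stand-in for one residual block: two convolutions plus a skip path.
        block = [("conv", channels), ("conv", channels)]
        layers.append(("residual_block", block))
    layers.append(("pool", "avg"))
    layers.append(("affine", num_classes))  # one output per category
    return layers


model_spec = residual_stack(num_blocks=3, channels=16)
for layer in model_spec:
    print(layer)
```

Changing `num_blocks` is enough to make the network deeper, without writing out each repeated block by hand.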
Callbacks allow a model to report its progress back to users during training. In this notebook, we present a callback that plots training cost in real time within the Jupyter notebook.
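The callback pattern itself can be sketched in a few lines of plain Python. The hook name `on_minibatch_end`, its arguments, and the toy training loop are assumptions for illustration and do not reproduce neon's callback signatures.

```python
class CostRecorder:
    """Hypothetical callback that records the cost after every minibatch."""

    def __init__(self):
        self.history = []

    def on_minibatch_end(self, step, cost):
        self.history.append((step, cost))


def train(num_steps, callbacks):
    """Toy training loop that notifies each callback after every step."""
    cost = 1.0
    for step in range(num_steps):
        cost *= 0.5  # placeholder for a real optimization step
        for callback in callbacks:
            callback.on_minibatch_end(step, cost)


recorder = CostRecorder()
train(4, [recorder])
print(recorder.history)
```

A plotting callback would follow the same structure, redrawing a figure from `history` inside the hook instead of only storing the values.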
Overfitting is often encountered when training deep learning models. This tutorial demonstrates how to use our visualization tools to detect when a model has overfit on the training data, and how to apply Dropout layers to correct the problem.
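As a rough sketch of both ideas, the code below flags the first epoch where validation cost rises while training cost keeps falling, and applies dropout to a list of activations. It is plain Python with made-up cost curves. The dropout shown is the common "inverted" variant, which rescales surviving units during training and may differ from how neon's own Dropout layer is implemented.

```python
import random


def overfit_epoch(train_cost, valid_cost):
    """Return the first epoch where validation cost rises while training cost falls."""
    for epoch in range(1, len(valid_cost)):
        rising = valid_cost[epoch] > valid_cost[epoch - 1]
        still_fitting = train_cost[epoch] < train_cost[epoch - 1]
        if rising and still_fitting:
            return epoch
    return None


def dropout(activations, keep=0.5, rng=random):
    """Inverted dropout: zero units with probability 1 - keep, rescale the rest."""
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Illustrative cost curves (not measured results).
train_cost = [1.0, 0.7, 0.5, 0.35, 0.2]
valid_cost = [1.1, 0.8, 0.7, 0.75, 0.9]
print(overfit_epoch(train_cost, valid_cost))  # 3

dropped = dropout([1.0] * 8, keep=0.5, rng=random.Random(0))
print(dropped)
```

Dropout is applied only during training; at inference time the full set of activations is used.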
Hanlin Tang has applied machine learning and data science across multiple domains, including computational neuroscience, proteomics, and defense policy. Prior to his graduate work on recurrent neural networks for object recognition, he built models and performed analyses for several defense and intelligence agencies while at the RAND Corporation.