Dr. Ole J. Mengshoel is a Principal Systems Scientist in the ECE Department at CMU Silicon Valley. His current research focuses on: scalable computing in artificial intelligence and machine learning; stochastic optimization; and applications of artificial intelligence and machine learning.
Dr. John Paul Shen was a Nokia Fellow and the founding director of Nokia Research Center - North America Lab. After spending 15 years in industry, all in Silicon Valley, he returned to CMU in the fall of 2015 as a tenured Full Professor in the ECE Department, based at the CMU Silicon Valley campus.
Scalability of artificial intelligence (AI) and machine learning (ML) algorithms, methods, and software has long been an important research topic. In ongoing and future work at CMU Silicon Valley, we take advantage of opportunities that have emerged due to recent dramatic improvements in parallel and distributed hardware and software. With the availability of Big Data, powerful computing platforms ranging from small (smart phones, wearable computers, IoT devices) to large (elastic clouds, data centers, supercomputers), as well as large and growing business on the Web, the importance and impact of scalability in AI and ML is only increasing. We will now discuss a few specific results and projects.
In the area of parallel and distributed algorithms, we have developed parallel algorithms and software for junction tree propagation, an algorithm that is a workhorse in commercial and open-source software for probabilistic graphical models. On the distributed front, we have developed and are developing MapReduce-based algorithms for speeding up and scaling up learning of Bayesian networks, as well as anomaly detection, from complete and incomplete data, and have experimentally demonstrated their benefits using Apache Hadoop* and Apache Spark*. Finally, we have an interest in matrix factorization (MF) for recommender systems on the Web, and have developed an incremental MF algorithm that can take advantage of Spark. Large-scale recommender and machine learning systems, which are currently essential components of many Web sites, can benefit from incremental methods because these adapt more quickly to varying customer choices than traditional batch methods, while retaining high accuracy.
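To make the incremental idea concrete, the following is a minimal single-machine sketch of incremental matrix factorization, in which each incoming rating triggers one stochastic gradient step instead of a full batch retrain. It is not the algorithm developed at CMU Silicon Valley and does not use Spark; the class name `IncrementalMF`, its parameters (`n_factors`, `lr`, `reg`), and the example ratings are all assumptions chosen for illustration.

```python
import random


class IncrementalMF:
    """Hypothetical sketch: matrix factorization updated one rating at a time."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.02, seed=0):
        self.n_factors = n_factors   # size of each latent vector (assumed value)
        self.lr = lr                 # SGD learning rate (assumed value)
        self.reg = reg               # L2 regularization strength (assumed value)
        self.rng = random.Random(seed)
        self.user_factors = {}       # user id -> latent vector
        self.item_factors = {}       # item id -> latent vector

    def _init_vector(self):
        # Small random values so that new users and items start near zero.
        return [self.rng.uniform(-0.1, 0.1) for _ in range(self.n_factors)]

    def predict(self, user, item):
        """Return the predicted rating, or 0.0 for an unseen user or item."""
        p = self.user_factors.get(user)
        q = self.item_factors.get(item)
        if p is None or q is None:
            return 0.0
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item, rating):
        """Apply one SGD step for a single observed rating; return the error."""
        p = self.user_factors.setdefault(user, self._init_vector())
        q = self.item_factors.setdefault(item, self._init_vector())
        err = rating - sum(a * b for a, b in zip(p, q))
        for k in range(self.n_factors):
            pk, qk = p[k], q[k]
            p[k] += self.lr * (err * qk - self.reg * pk)
            q[k] += self.lr * (err * pk - self.reg * qk)
        return err


if __name__ == "__main__":
    model = IncrementalMF()
    # Made-up stream of (user, item, rating) events arriving over time.
    stream = [("alice", "book", 5.0), ("bob", "film", 3.0)] * 200
    for user, item, rating in stream:
        model.update(user, item, rating)
    print(model.predict("alice", "book"), model.predict("bob", "film"))
```

Because each update touches only one user vector and one item vector, a new customer choice is reflected in the model immediately, which is the property that distinguishes incremental methods from batch retraining. A distributed version would additionally need to partition and synchronize the factor tables, which this sketch leaves out.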
Caffe* is a deep learning framework originally developed at the Berkeley Vision and Learning Center. Its successor, Caffe2, was recently officially released, with Facebook as the driving force behind the open source framework. TensorFlow*, supported by several companies including Google and Intel, is another scalable deep learning framework that we have used in projects. In our hands-on machine learning experience with Caffe2, we have found it to support rapid prototyping and experimentation, simple compilation, and better portability than earlier versions of Caffe. We also have strong results with TensorFlow*, for example in human activity recognition projects using deep learning.
We are experimenting with Intel’s PyLatte machine learning library, which is written in Python* and is optimized for Intel CPUs. The goals of PyLatte include ease of programming, high productivity, high performance, and leveraging the power of CPUs. A CMU SV project has focused on implementing speech recognition and image classification models in PyLatte, using deep learning with neural networks. In speech recognition experiments, we have found PyLatte to be easy to use, with a flexible training step and short training time.
We look forward to continuing to develop parallel, distributed, and incremental algorithms for scalable intelligent models and systems as an Intel® Parallel Computing Center at CMU Silicon Valley. We create novel algorithms, models, and applications that utilize emerging hardware and software computing platforms, including multi- and many-core computers, cloud computing, MapReduce, Apache Hadoop*, and Apache Spark*.
- Ole Mengshoel, Aniruddha Basak, and Tong Yu, November 2017, Machine Learning with HPC: Optimizing for Big Data, Accuracy, and Response Time, HPCDevCon17