Intel & Industry Partners Showcase Advances for Data Scientists & Analytics Software Developers at the Strata Data Conference

Amazing customer presentations and demos showcased new capabilities and opportunities for data scientists, data engineers and analytics software developers at the Strata Data Conference in San Jose. At jam-packed technical sessions and exhibit demonstrations, industry leaders, including SAS, Cloudera and Intel, highlighted what’s possible at the intersection of big data, artificial intelligence (AI), and on-premises and edge-to-cloud solutions.

SAS showcased the SAS* Viya* cloud-enabled, in-memory analytics platform running on Intel® Xeon® Scalable processors and Intel® Optane™ memory. The company demonstrated how SAS Viya software and Intel® technology deliver faster, more elastic, scalable and fault-tolerant processing for analytics. The solution supports programming in SAS and in other languages such as Python, R, Java and Lua, and it runs in on-premises, hybrid and public cloud environments. To learn more, check out the SAS Viya Developer Free Trial offer.

Experts from JD.com, Microsoft, the University of California (UC) Berkeley and UCSF joined Intel in sessions that highlighted the power of Intel’s open source BigDL software. Intel has made significant technical contributions to the Apache Spark community over the years, including BigDL, an Intel-led open source distributed library for deep learning applications. Unlike a number of other libraries for building deep learning applications, BigDL is native to Apache Spark. With BigDL, you can write deep learning applications as standard Spark programs that run on existing Spark or Hadoop* clusters. By using infrastructure already in place instead of deploying a new cluster with an unfamiliar architecture, BigDL accelerates time to value, reduces total cost of ownership (TCO), and improves ease of use.

BigDL enables developers and data scientists to build deep learning applications while leveraging their existing investments in Spark and Hadoop infrastructure. BigDL has strong support in the industry, including Microsoft* Azure, Cloudera, Amazon* Web Services (AWS), JD.com, Databricks, Cray, and GigaSpaces, among others.   

For instance, JD.com software development engineer Zhen Fan and Intel senior software engineer Wei Ting Chen presented “Spark on Kubernetes: A case study from JD.com.” They explained how JD.com uses Spark on Kubernetes in a production environment, and why the company chose Spark on Kubernetes for its AI workloads.

A highlight of their talk was large-scale image feature extraction with BigDL software. JD.com has hundreds of millions of merchandise pictures stored in an open-source distributed database. Efficient data retrieval and processing on this large-scale, distributed infrastructure is a key requirement of their image feature extraction pipeline. BigDL, Apache Spark, and Intel Xeon Scalable processors deliver on that requirement.

UC Berkeley-based Microsoft researcher Shivaram Venkataraman and Intel solutions architect Sergey Ermolin presented “Accelerating deep learning on Apache Spark using BigDL with coarse-grained scheduling.” They outlined how a new parameter manager implementation together with coarse-grained scheduling can provide significant speed-ups for deep learning models using the BigDL framework at scale.

 

Intel software engineer Jennie Wang and technical program manager Yulia Tell joined UCSF’s Valentina Pedoia and Berk Norman to present “Automatic 3D MRI knee damage classification with 3D CNN using BigDL on Spark.” They explained how automatically classifying damage to the meniscus at the time of an MRI scan can allow quicker and more accurate diagnosis. They provided a fascinating overview of UCSF’s classification system built with 3D convolutional neural networks using the BigDL framework and Apache Spark on Intel Xeon processor-based systems.

Intel Data Center Group director of marketing Kevin Huiskes and Software and Services Group big data engineering director Radhika Rangarajan presented “Accelerating Analytics and AI from the Edge to the Cloud.” They described the big data analytics challenges and opportunities facing organizations, and they showcased a comprehensive range of edge-to-cloud technologies and platforms from Intel. They also shared best practices for optimizing data, analytics and AI pipelines, enabling organizations to break through the technical barriers to competing on data.
 

The Intel exhibit featured a cool demo showing how motorsports teams can use deep learning and virtual reality (VR) to enhance driver performance as well as the spectator experience. A video feed from the race track streams live to thousands of fans, and datacenter technology creates the content and drives the streaming. Intel® AI technology identifies cars on the track and changes how fans experience the event.

Using a streaming device, fans can experience the thrill of a live race from anywhere in the world. Multiple camera feeds captured from the track go to the datacenter, where AI identifies each car. Telemetry data is pulled from each car over 5G wireless networking to create a personalized experience powered by Intel technology. Fans can follow their favorite car throughout the race, and they can watch action replays as soon as something happens anywhere on the track.

 

For more information on Intel’s software developer programs for AI, visit the Intel® AI Academy website. 

Researchers, data scientists, and deep-learning explorers who are ready to scale out deep learning algorithms on Apache Spark can sign up for free compute for BigDL.


Other names and brands may be claimed as the property of others.

For more complete information about compiler optimizations, see our Optimization Notice.