Co-author: Suqiang Song of Mastercard*.
This article introduces a joint initiative between Mastercard* and Intel to build user-item propensity models for a universal recommender AI service. Analytics Zoo [1] is a unified analytics and AI platform that seamlessly unites Apache Spark*, TensorFlow*, Keras*, and BigDL [2] programs into an integrated pipeline, which can transparently scale out to large Apache Hadoop* and Spark clusters for distributed training or inference.
In the finance industry, user-item propensity can be used to estimate the probability that a consumer will buy from a particular merchant or retailer within a given industry. Such a model can be used to generate market research insights or to deliver personalized recommendations of relevant financial products or merchant deals. Using deep learning-based neural recommendation models built on Spark, the recommender system can play an essential role in improving the consumer experience, campaign performance, and the accuracy of targeted marketing offers and programs, with relevant messages that encourage loyalty and rewards. This article uses a personalized marketing business use case as the running example and focuses on predicting user-item propensity from formatted credit card transactions.
Mastercard, as a leading global provider of payment solutions, is integrating artificial intelligence (AI) into its platform to serve its customers better. Analytics Zoo, which runs BigDL on Spark on large Intel® Xeon® Scalable processor clusters, is an ideal solution that meets enterprise requirements for deep learning, as it allows users to develop and run deep learning applications in production directly on existing big data (Apache Hadoop/Spark) infrastructure. In contrast, deploying GPU-based solutions in enterprises poses many challenges (e.g., poor tool integration, expensive data duplication and movement, time-consuming and engineer-intensive setup, limited monitoring, and a steep learning curve), as such solutions are incompatible with existing data analytics infrastructure.
Deep learning can play an important role in driving a higher ROI through marketing campaign effectiveness. For this reason, greater emphasis is placed on sharper insights into consumer behavior to connect with customers according to their interests and preferences. For instance, an offer from a merchant is most effective if it can be sent to the consumers with the highest purchase potential. Conventional machine learning algorithms played a vital role in previous solutions. However, the industry is seeking a more robust solution with simplified procedures to handle model complexity and labor-intensive feature engineering, and to deliver greater accuracy. Recently, many deep learning-based neural recommendation models have been proposed to further improve the effectiveness of marketing campaigns.
A recommender system (RS) is an information filtering tool that guides users in a personalized way to discover their preferences within a large space of possible options. It is a critical tool for promoting sales and services on many online websites and mobile applications. For instance, 80 percent of movies watched on Netflix* come from recommendations [3], and 60 percent of video clicks on YouTube* come from home page recommendations [4]. Recent advances in deep learning-based recommender systems have gained significant attention by overcoming the obstacles of conventional models and achieving high recommendation quality [5].
Recommendation models can be classified into three categories: collaborative filtering, content-based, and hybrid systems. Collaborative filtering makes recommendations by learning from historical user-item interactions, through either explicit feedback (e.g., a user's previous ratings) or implicit feedback (e.g., purchase histories). Due to data constraints, in this case collaborative filtering is applied to implicit data.
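As a concrete illustration, implicit feedback can be encoded as a binary user-item interaction matrix, where any observed interaction counts as a positive signal and everything else is treated as unobserved. The purchase pairs and matrix sizes below are hypothetical; in production this matrix would be derived from transaction records on Spark rather than built in NumPy.

```python
import numpy as np

# Hypothetical purchase histories as (user_id, item_id) pairs.
# With implicit feedback there are no ratings: an observed
# interaction is a 1, everything else is unobserved (0).
purchases = [(0, 1), (0, 3), (1, 0), (2, 1), (2, 2)]

n_users, n_items = 3, 4
interactions = np.zeros((n_users, n_items), dtype=np.float32)
for user, item in purchases:
    interactions[user, item] = 1.0
```

Note that the zeros are not true negatives; they are merely unobserved, which is exactly why the negative sampling discussed later is needed when training a neural model on such data.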
As an integrated analytics and AI platform running natively on Spark, Analytics Zoo meets the standard requirements for enterprise deep learning applications:
Analyze large amounts of data on the same big data clusters where the data are stored (Hadoop Distributed File System (HDFS), Apache HBase*, Apache Hive*, etc.) rather than moving or duplicating the data.
Add deep learning capabilities to existing analytic applications and machine learning workflows rather than rebuild all of them.
Leverage existing big data clusters and infrastructure (resource allocation, workloads management, and enterprise monitoring).
Reduce feature engineering workloads. Deep learning algorithms generate an exponential growth of hidden embedding features and perform internal feature selection and optimization automatically during cross-validation at the training stage. When building the model, the algorithms focus on only a few pre-defined sliding features and custom overlap features, removing most of the lifetime value (LTV) pre-calculation work and saving hours of time and substantial resources.
Automated model optimization. Traditional machine learning (ML) approaches rely heavily on human ML experts to optimize the model. Analytics Zoo provides more options for finding an optimally performing, robust configuration.
Zero deployment or operation costs, since Analytics Zoo runs as a standard Spark program on Intel Xeon processors.
Support for high-level pipeline APIs, such as DataFrames, ML Pipelines, autograd, transfer learning, and Keras/Keras2.
Considering that Mastercard has run traditional machine learning for decades on similar models and has invested resources in the Spark ML ecosystem, such as Spark MLlib, the business stakeholders wanted to benchmark the two approaches and identify the differences. A benchmark test was therefore conducted between traditional Spark machine learning and the BigDL models in Analytics Zoo.
Selected data set: data collected from a specific channel over the past three years.
Environment: a production Hadoop cluster.
For the traditional machine learning approach, Alternating Least Squares (ALS) [6] from Spark MLlib was chosen.
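Spark MLlib provides ALS out of the box; to show the idea behind the algorithm (rather than the MLlib API), here is a minimal NumPy sketch of alternating least squares on a toy rating matrix. The matrix, latent rank, regularization value, and iteration count are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy user-item rating matrix (0 = unobserved).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 0, 5, 4]], dtype=float)
mask = R > 0
k, lam = 2, 0.1                         # latent rank, L2 regularization

U = rng.normal(scale=0.1, size=(R.shape[0], k))   # user factors
V = rng.normal(scale=0.1, size=(R.shape[1], k))   # item factors

for _ in range(20):
    # Fix V and solve a ridge regression per user, then swap roles.
    for u in range(R.shape[0]):
        obs = mask[u]
        A = V[obs].T @ V[obs] + lam * np.eye(k)
        U[u] = np.linalg.solve(A, V[obs].T @ R[u, obs])
    for i in range(R.shape[1]):
        obs = mask[:, i]
        A = U[obs].T @ U[obs] + lam * np.eye(k)
        V[i] = np.linalg.solve(A, U[obs].T @ R[obs, i])

pred = U @ V.T          # reconstructed ratings for all user-item pairs
rmse = np.sqrt(np.mean((pred[mask] - R[mask]) ** 2))
```

Each half-step has a closed-form least-squares solution, which is what makes ALS easy to parallelize across users and items; Spark's implementation distributes exactly these per-row solves across the cluster.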
For the deep learning approach, based on the latest research and industry practice, a Neural Collaborative Filtering (NCF) model and a wide and deep (WAD) model were chosen as the two candidate recommenders. Keras-style APIs from Analytics Zoo were used to build the deep learning models in Python* and Scala*.
Figure 1. Comparing deep learning models with the ALS model
The simple, generic NCF model, first proposed by Xiangnan He [7], is designed to serve as a guideline for developing deep learning methods for recommendation, aiming to capture the non-linear relationship between users and items. Because there is a large number of unobserved instances, NCF uses negative sampling to reduce the training data size, which significantly improves learning efficiency. Traditional matrix factorization can be viewed as a special case of NCF. With Analytics Zoo, users can easily build an NCF model as shown in the following graph.
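The negative sampling step can be sketched in plain Python: for every observed positive pair, a fixed number of unobserved items are drawn as negatives instead of using all unobserved pairs. The interaction pairs, item count, and 4:1 negative ratio below are hypothetical, not values from the Mastercard data.

```python
import random

random.seed(42)

# Hypothetical observed positive (user, item) interactions.
positives = {(0, 1), (0, 3), (1, 0), (2, 2)}
n_items = 50
neg_ratio = 4                     # negatives drawn per positive

samples = []                      # (user, item, label) training rows
for user, item in sorted(positives):
    samples.append((user, item, 1))
    drawn = 0
    while drawn < neg_ratio:
        j = random.randrange(n_items)
        if (user, j) not in positives:    # keep only unobserved items
            samples.append((user, j, 0))
            drawn += 1
```

This keeps the training set at a manageable multiple of the positive set (here 5x) rather than the full user-item cross product, which is what makes training on billions of transactions feasible.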
In 2016, Heng-Tze Cheng [8] proposed an app recommender system for the Google Play* store using a wide and deep (WAD) model. The wide component is a single-layer perceptron that works as a generalized linear model. The deep component is a multilayer perceptron similar to NCF. Combining these two learning techniques enables the recommender system to capture both memorization and generalization. In this case, merchant ID and other features were used to generate the cross columns for the wide model.
Figure 3. A Wide and Deep Model diagram
The WAD model uses a SparseTensor and quite a few layers designed explicitly for sparse computation, e.g., SparseLinear and SparseJoinTable. Analytics Zoo supports both the DataFrame and Resilient Distributed Dataset (RDD) interfaces for data preparation and training, providing flexibility for different scenarios and compatibility from Spark 1.5 through the latest versions.
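The cross columns mentioned above can be illustrated with a hashing trick in plain Python: a pair of categorical values is hashed into one of a fixed number of buckets, and each bucket becomes a binary feature for the wide (linear) component. The function name, bucket count, and feature values here are hypothetical; a production pipeline would use the framework's own cross-column utilities.

```python
import hashlib

def cross_bucket(*values, n_buckets=1000):
    """Deterministically hash a tuple of categorical values into one of
    n_buckets; each bucket acts as one binary cross feature for the
    wide component of a wide and deep model."""
    key = "_x_".join(str(v) for v in values)
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n_buckets

# Hypothetical transaction features: merchant ID crossed with a spend band.
b1 = cross_bucket("merchant_123", "spend_high")
b2 = cross_bucket("merchant_123", "spend_low")
```

Hashing avoids materializing the full cross product of two high-cardinality vocabularies (e.g., millions of merchants times spend bands) at the cost of occasional bucket collisions, which the linear model tolerates well.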
Using the evaluation utilities from Spark MLlib ALS, the recommenders implemented with NCF and WAD were measured with the following metrics.
To compare with traditional matrix factorization algorithms, the same data and optimization parameters were also used to train ALS in Spark 2.2.0. The deep learning models bring significant improvements over the ALS model, as shown in the table below.
|                                       | NCF Model | WAD Model |
|---------------------------------------|-----------|-----------|
| Recall improvement over ALS           | 29%       | 26%       |
| Precision improvement over ALS        | 18%       | 21%       |
| Top-20 accuracy improvement over ALS  | 14%       | 16%       |
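For reference, the precision and recall metrics behind the table can be computed per user from a ranked recommendation list and the set of ground-truth items. This is a generic sketch of precision@k and recall@k, not the Spark MLlib evaluation code used in the benchmark, and the item IDs are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k=20):
    """Precision@k and recall@k for one user.

    recommended: ranked list of item IDs produced by the model.
    relevant:    set of ground-truth items the user interacted with.
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical user: 2 of the top-4 recommendations are relevant.
p, r = precision_recall_at_k([5, 9, 2, 7], {2, 7, 11}, k=4)
# p == 0.5, r == 2/3
```

Averaging these per-user scores over all test users yields the aggregate recall, precision, and top-k accuracy figures reported above.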
The Analytics Zoo model can be seamlessly integrated into web services such as Spark Streaming, Kafka*, etc., by using Plain Old Java Object (POJO)-style local Java APIs or the Scala/Python model-loading APIs.
Mastercard uses a data pipeline framework, Apache NiFi [9], to build its enterprise data pipeline platform. It developed customized processors to embed the deep learning model serving process into existing enterprise data pipelines by leveraging the serving APIs from Analytics Zoo.
This article describes our experience in building a recommender AI service from consumer transaction history using deep learning with Analytics Zoo, which provides a great solution for the deep learning requirements of enterprises. Two deep learning models (NCF and WAD) were developed and evaluated. Compared with traditional machine learning algorithms (such as LR or ALS), the deep learning models significantly improve recommendation quality and simplify the model training procedure. As an end-to-end industry example, we showed how to leverage deep learning with Analytics Zoo to build an effective recommender system that helps power a critical element of Mastercard's marketing and personalization capabilities.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804