Machine Learning Introduction: Regression and Classification

Overview

This video examines two of the main problems in machine learning: regression and classification.

Regression is a combination of multidimensional fitting and function interpolation. In a regression problem, you are trying to find a function approximation that minimizes the error, as measured by a cost function.

Classification identifies group membership. If you have multiple events characterized by input parameters, each of which can carry a different label, and you want your system to predict which label applies, you have the classical classification problem.

Transcript

Hello, I'm Vadim Karpusenko, a developer evangelist here at Intel. This introduction to machine learning will cover two problem types, regression and classification; explain scoring, cost functions, and training; and discuss the basics of supervised, unsupervised, and reinforcement learning.

Let's begin with the two main problems machine learning is trying to solve: regression and classification. Mathematically speaking, regression is a combination of multidimensional fitting and function interpolation. With a regression problem, you're trying to find a function approximation with minimal error, as measured by a cost function. In other words, regression tries to predict a numeric value (for instance, the price of a house) from a set of input parameters such as square footage, age, location, number of bedrooms and bathrooms, and so on.
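To make the house-price example concrete, here is a minimal sketch of regression using an ordinary least-squares line fit on a single input (square footage). The data values, the function names fit_line and mean_squared_error, and the choice of mean squared error as the cost function are illustrative assumptions, not something specified in the video.

```python
# Hypothetical toy data: square footage -> sale price (in thousands of dollars).
square_feet = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [200.0, 300.0, 400.0, 500.0]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Cost function: average squared deviation between prediction and target."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


slope, intercept = fit_line(square_feet, prices)
training_cost = mean_squared_error(square_feet, prices, slope, intercept)

# Predict the price of a previously unseen 1800 sq ft house.
predicted_price = slope * 1800.0 + intercept
print(slope, intercept, training_cost, predicted_price)
```

A real model would use several inputs (age, location, number of bedrooms) rather than one, but the idea is the same: choose the parameters that make the cost as small as possible.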

Classification is a different problem type. It identifies group membership. If you have multiple events characterized by input parameters, each of which can be labelled differently, and you want your system to predict which label should be used, you have the classical classification problem.

Take spam filters, for example. Emails arriving in your inbox are processed by a machine learning algorithm, and if certain criteria are met, they are labelled as spam.
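As a rough illustration of classification, the sketch below labels an email as "spam" or "ham" with a nearest-centroid classifier. The two input features (number of links and number of exclamation marks), the training examples, and the function names are all hypothetical; real spam filters use far richer features and different algorithms.

```python
# Hypothetical labelled examples: (number of links, exclamation marks) -> label.
training = [
    ((8.0, 6.0), "spam"),
    ((7.0, 9.0), "spam"),
    ((1.0, 0.0), "ham"),
    ((0.0, 1.0), "ham"),
]


def compute_centroids(examples):
    """Average the feature vectors of each label to get one centroid per label."""
    sums = {}
    counts = {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(features, centers):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(center):
        return sum((f - c) ** 2 for f, c in zip(features, center))

    return min(centers, key=lambda label: squared_distance(centers[label]))


centers = compute_centroids(training)
print(classify((6.0, 7.0), centers))
print(classify((1.0, 1.0), centers))
```

The output here is a label drawn from a fixed set, not a number on a continuous scale, which is the difference from regression.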

Machine learning is a fascinating topic because it incorporates substantial parts of several fields: statistics, artificial intelligence theory, data analytics, and numerical methods. Simply put, a machine learning application is one that is capable of improving its prediction results over successive iterations.

Or you could say that it improves with experience. The process of an application improving with experience is, naturally enough, called training. It can take many iterations to gradually improve the results.

During training, data is given to a machine learning algorithm, which refines its internal representation (its numerical parameters) as it encounters deviations, or training errors. The purpose of this stage is to minimize a cost function (also called an error function), or to maximize a likelihood, by adjusting the algorithm's internal weights.
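One common way to adjust internal weights so that the cost shrinks is gradient descent. The video does not name a specific optimizer, so the sketch below is an assumed example: it fits a line y = w * x + b to hypothetical data by repeatedly stepping the weights against the gradient of the mean squared error. The learning rate and step count are illustrative choices.

```python
def cost(w, b, xs, ys):
    """Mean squared error of the line y = w * x + b on the training data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Gradient descent: nudge w and b downhill on the cost at every step."""
    w, b = 0.0, 0.0
    n = len(xs)
    history = []
    for _ in range(steps):
        history.append(cost(w, b, xs, ys))
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history


# Hypothetical training data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b, history = train(xs, ys)
print(w, b, history[0], history[-1])
```

The recorded cost history falls from its starting value toward zero as the weights approach 2 and 1, which is the "improving with experience" behaviour described above.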

When the algorithm's accuracy improves, we call this learning. Once the results are accurate enough, the machine learning application can be deployed to solve the problem it was designed to address. This is the final step of machine learning, and it's called scoring.

At this point, the algorithm has been trained on the training data set and is ready to operate on new, previously unseen data. The process of calculating the regression function value, or the most probable label assignment, is called scoring, or prediction. In the supervised learning case, the training data set is pre-labelled for a classification problem, or the function values are known in the case of regression.

After training is done and the model has reached a minimum of the cost function on the training data set, we switch to scoring, where we assign labels or predict regression function values for new data. In the unsupervised learning case, the algorithm detects data features automatically; what it finds depends on the purpose of the algorithm as well as on the assumptions made about the properties of the observed values. Clustering and filtering are typical unsupervised learning tasks.
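Clustering can be sketched with k-means, one well-known unsupervised algorithm (the video does not name a particular method, so this choice is an assumption). The example groups unlabelled one-dimensional points into two clusters. The data, the starting centers, and the function name k_means_1d are hypothetical.

```python
def k_means_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-averaging."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Unlabelled data: no target values are supplied, only the observations.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]

# Assumed starting guess: the smallest and largest observations.
final_centers, final_clusters = k_means_1d(points, [min(points), max(points)])
print(final_centers, final_clusters)
```

No labels are given to the algorithm; the grouping comes only from the structure of the data and from the assumption that two clusters exist.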

In reinforcement learning, the cost depends on the environment and on the situation required within that environment. In the end, a policy is required to determine the cost. Deep learning algorithms calculate a cumulative error, or cost function, over the errors the model makes on all examples at every step of the training algorithm. The cumulative error is used by the optimization to choose the next set of parameters, and the process continues.

Deep learning networks can be trained in an unsupervised or supervised manner, for both unsupervised and supervised learning tasks. In theory, deep learning should give you better scoring results thanks to automatic feature detection.

Scoring is what brings your algorithm to life in the world. It returns movie recommendations, filters photo searches, and allows robots to maneuver on unfamiliar terrain. You train machine learning models for classification, regression, and prediction, and, in the end, they might even mimic the work and complexity of the human mind.

For more overviews and technical videos about machine learning, refer to Intel's Machine Learning Zone. Thanks for watching.
