Developer Guide and Reference

  • 2021.1
  • 12/04/2020
  • Public Content
Contents

k-Nearest Neighbors Classification (k-NN)

k
-NN classification algorithm infers the class for the new feature vector by computing majority vote of the
k
nearest observations from the training set.

Mathematical formulation

Training
Let LaTex Math image. be the training set of
p
-dimensional feature vectors, let LaTex Math image. be the set of class labels, where LaTex Math image. , LaTex Math image. . Given
X
,
Y
and the number of nearest neighbors
k
, the problem is to build a model that allows distance computation between the feature vectors in training and inference sets at the inference stage.
Training method:
brute-force
The training operation produces the model that stores all the feature vectors from the initial training set
X
.
Training method:
k-d tree
The training operation builds a
k
-
d
tree that partitions the training set
X
(for more details, see k-d Tree).
Inference
Let LaTex Math image. be the inference set of
p
-dimensional feature vectors. Given
X’
, the model produced at the training stage and the number of nearest neighbors
k
, the problem is to predict the label LaTex Math image. for each LaTex Math image. , LaTex Math image. , by performing the following steps:
  1. Identify the set LaTex Math image. of the
    k
    feature vectors in the training set that are nearest to LaTex Math image. with respect to the Euclidean distance.
  2. Estimate the conditional probability for the
    l
    -th class as the fraction of vectors in LaTex Math image. whose labels LaTex Math image. are equal to
    l
    :
    LaTex Math image.
  3. Predict the class that has the highest probability for the feature vector LaTex Math image. :
    LaTex Math image.
Inference method:
brute-force
Brute-force inference method determines the set LaTex Math image. of the nearest feature vectors by iterating over all the pairs LaTex Math image. in the implementation defined order, LaTex Math image. , LaTex Math image. . The final prediction is computed according to the equations (1) and (2).
Inference method:
k-d tree
K-d tree inference method traverses the
k
-
d
tree to find feature vectors associated with a leaf node that are closest to LaTex Math image. , LaTex Math image. . The set LaTex Math image. of the currently-known nearest
k
-th neighbors is progressively updated during tree traversal. The search algorithm limits exploration of the nodes for which the distance between the LaTex Math image. and respective part of the feature space is not less than the distance between LaTex Math image. and the most distant feature vector from LaTex Math image. . Once tree traversal is finished, LaTex Math image. . The final prediction is computed according to the equations (1) and (2).

Programming Interface

All types and functions in this section are declared in the
oneapi::dal::knn
namespace and be available via inclusion of the
oneapi/dal/algo/knn.hpp
header file.
Descriptor
template<typename
Float
= detail::descriptor_base<>::float_t, typename
Method
= detail::descriptor_base<>::method_t, typename
Task
= detail::descriptor_base<>::task_t>
class
descriptor
Template Parameters
  • Float
    – The floating-point type that the algorithm uses for intermediate computations. Can be
    float
    or
    double
    .
  • Method
    – Tag-type that specifies an implementation of algorithm. Can be
    method::bruteforce
    or
    method::kd_tree
    .
  • Task
    – Tag-type that specifies type of the problem to solve. Can be
    task::classification
    .
Constructors
descriptor
(std::int64_t
class_count
, std::int64_t
neighbor_count
)
Creates a new instance of the class with the given
class_count
and
neighbor_count
property values.
Public Methods
auto &
set_class_count
(std::int64_t
value
)
auto &
set_neighbor_count
(std::int64_t
value
)
Method tags
struct
brute_force
Tag-type that denotes brute-force computational method.
struct
kd_tree
Tag-type that denotes k-d tree computational method.
using
by_default
= brute_force
Alias tag-type for brute-force computational method.
Task tags
struct
classification
Tag-type that parameterizes entities used for solving classification problem.
using
by_default
= classification
Alias tag-type for classification task.
Model
template<typename
Task
= task::by_default>
class
model
Template Parameters
Task
– Tag-type that specifies type of the problem to solve. Can be
task::classification
.
Constructors
model
()
Creates a new instance of the class with the default property values.
Training
train(...)
Input
template<typename
Task
= task::by_default>
class
train_input
Template Parameters
Task
– Tag-type that specifies type of the problem to solve. Can be
task::classification
.
Constructors
train_input
(
const
table &
data
,
const
table &
labels
)
Creates a new instance of the class with the given
data
and
labels
property values.
Properties
const
table &
data
= table{}
The training set LaTex Math image. .
Getter & Setter


const table & get_data() const
auto & set_data(const table &data)

const
table &
labels
= table{}
Vector of labels LaTex Math image. for the training set LaTex Math image. .
Getter & Setter


const table & get_labels() const
auto & set_labels(const table &labels)

Result
template<typename
Task
= task::by_default>
class
train_result
Template Parameters
Task
– Tag-type that specifies type of the problem to solve. Can be
task::classification
.
Constructors
train_result
()
Creates a new instance of the class with the default property values.
Properties
const
model<Task> &
model
= model<Task>{}
The trained LaTex Math image. -NN model.
Getter & Setter


const model< Task > & get_model() const
auto & set_model(const model< Task > &value)

Inference
infer(...)
Input
template<typename
Task
= task::by_default>
class
infer_input
Template Parameters
Task
– Tag-type that specifies type of the problem to solve. Can be
task::classification
.
Constructors
infer_input
(
const
table &
data
,
const
model<Task> &
model
)
Creates a new instance of the class with the given
model
and
data
property values.
Properties
const
table &
data
= table{}
The dataset for inference LaTex Math image. .
Getter & Setter


const table & get_data() const
auto & set_data(const table &data)

const
model<Task> &
model
= model<Task>{}
The trained LaTex Math image. -NN model.
Getter & Setter


const model< Task > & get_model() const
auto & set_model(const model< Task > &m)

Result
template<typename
Task
= task::by_default>
class
infer_result
Template Parameters
Task
– Tag-type that specifies type of the problem to solve. Can be
task::classification
.
Constructors
infer_result
()
Creates a new instance of the class with the default property values.
Properties
const
table &
labels
= table{}
The predicted labels.
Getter & Setter


const table & get_labels() const
auto & set_labels(const table &value)

Product and Performance Information

1

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.