Developer Guide

Details

The library provides a Multinomial Naïve Bayes classifier [Renie03].
Let $J$ be the number of classes, indexed $0, 1, \ldots, J-1$. The integer-valued feature vector $x_i = (x_{i1}, \ldots, x_{ip})$, $i = 1, \ldots, n$, contains scaled frequencies: the value of $x_{ik}$ is the number of times the $k$-th feature is observed in the vector $x_i$ (in terms of the document classification problem, $x_{ik}$ is the number of occurrences of the word indexed $k$ in the document $x_i$). For a given data set (a set of $n$ documents), $(x_1, \ldots, x_n)$, the problem is to train a Naïve Bayes classifier.
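
To make the data layout concrete, the following is a minimal sketch of how documents could be turned into integer count vectors. The vocabulary, the sample documents, and the helper name `count_vector` are hypothetical and are not part of the library.

```python
from collections import Counter


def count_vector(tokens, vocabulary):
    """Return [x_i1, ..., x_ip]: occurrences of each vocabulary word in one document."""
    counts = Counter(tokens)
    # Counter returns 0 for words that never occur in the document.
    return [counts[word] for word in vocabulary]


# Hypothetical vocabulary with p = 3 features, indexed k = 0, 1, 2.
vocabulary = ["ball", "goal", "vote"]

# Hypothetical data set of n = 2 tokenized documents.
documents = [
    "goal goal ball".split(),
    "vote vote vote ball".split(),
]

# X[i][k] is the number of occurrences of word k in document i.
X = [count_vector(doc, vocabulary) for doc in documents]
```

Each row of `X` plays the role of one feature vector, and each column corresponds to one word of the vocabulary.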

Training Stage

The Training stage involves calculation of these parameters:
  • $\log(\theta_{jk}) = \log\left(\frac{N_{jk} + \alpha_k}{N_j + \alpha}\right)$, where $N_{jk}$ is the number of occurrences of the feature $k$ in the class $j$, $N_j$ is the total number of occurrences of all features in the class, the $\alpha_k$ parameter is the imagined number of occurrences of the feature $k$ (for example, $\alpha_k = 1$), and $\alpha$ is the sum of all $\alpha_k$.
  • $\log(p(\theta_j))$, where $p(\theta_j)$ is the prior class estimate.
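
The training parameters can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: the function name `train_multinomial_nb` is hypothetical, the same `alpha_k` is used for every feature, the smoothed estimate is taken to be (N_jk + α_k) / (N_j + α), and the prior is assumed to be the empirical class frequency (the text only says "prior class estimate").

```python
import math


def train_multinomial_nb(X, y, n_classes, alpha_k=1.0):
    """Return (log_prior, log_theta) for count vectors X and class labels y.

    Assumes every class 0..n_classes-1 appears at least once in y,
    otherwise the empirical prior would be log(0).
    """
    p = len(X[0])
    alpha = alpha_k * p  # alpha is the sum of all alpha_k

    class_counts = [0] * n_classes
    N_jk = [[0.0] * p for _ in range(n_classes)]
    for x, j in zip(X, y):
        class_counts[j] += 1
        for k, value in enumerate(x):
            N_jk[j][k] += value  # occurrences of feature k in class j

    n = len(X)
    # Assumption: prior estimated as the fraction of documents in each class.
    log_prior = [math.log(count / n) for count in class_counts]

    log_theta = []
    for j in range(n_classes):
        N_j = sum(N_jk[j])  # total occurrences of all features in class j
        log_theta.append(
            [math.log((N_jk[j][k] + alpha_k) / (N_j + alpha)) for k in range(p)]
        )
    return log_prior, log_theta


# Hypothetical data: two documents, three features, one document per class.
X = [[1, 2, 0], [1, 0, 3]]
y = [0, 1]
log_prior, log_theta = train_multinomial_nb(X, y, n_classes=2)
```

With α_k = 1 no feature receives zero probability, so a word never seen in a class during training does not by itself rule that class out.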

Prediction Stage

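Given a new feature vector $x_i$, the classifier determines the class the vector belongs to:

$$\mathrm{class}(x_i) = \operatorname{argmax}_j \left( \log(p(\theta_j)) + \sum_{k=1}^{p} x_{ik} \log(\theta_{jk}) \right)$$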
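
As a sketch of the prediction step, the code below applies the standard multinomial Naïve Bayes decision rule: each class is scored by its log prior plus the count-weighted sum of its log feature probabilities, and the class with the highest score is returned. That the library evaluates exactly this expression is an assumption. The function name `predict_class` and the parameter values are hypothetical.

```python
import math


def predict_class(x, log_prior, log_theta):
    """Return the index j that maximizes log p(theta_j) + sum_k x_k * log(theta_jk)."""
    scores = [
        lp + sum(x_k * lt_k for x_k, lt_k in zip(x, lt))
        for lp, lt in zip(log_prior, log_theta)
    ]
    # argmax over class indices; ties resolve to the lowest index.
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical trained parameters for J = 2 classes and p = 3 features.
log_prior = [math.log(0.5), math.log(0.5)]
log_theta = [
    [math.log(1 / 3), math.log(1 / 2), math.log(1 / 6)],
    [math.log(2 / 7), math.log(1 / 7), math.log(4 / 7)],
]

predicted_a = predict_class([0, 3, 1], log_prior, log_theta)
predicted_b = predict_class([1, 0, 4], log_prior, log_theta)
```

Working in log space turns the product of many small probabilities into a sum, which avoids floating-point underflow for long documents.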
