Z-score
Z-score normalization is an algorithm that produces data with each feature (column) having zero mean and unit variance.
Details
Given a set $X$ of $n$ feature vectors $x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})$ of dimension $p$, the problem is to compute the matrix $Y = (y_{ij})$ of dimension $n \times p$ as follows:

$$y_{ij} = \frac{x_{ij} - m_j}{\Delta}$$

where:
- $m_j$ is the mean of the $j$-th component of set $(X)_j$, where $j = 1, \ldots, p$
- the value of $\Delta$ depends on the computation mode

oneDAL provides two modes for computing the result matrix.
You can select the mode by setting the doScale flag (for details, see Algorithm Parameters).
The available modes are:
- Centering only. In this case, $\Delta = 1$ and no scaling is performed. After normalization, the mean of the $j$-th component of the result set $(Y)_j$ is zero.
- Centering and scaling. In this case, $\Delta = \sigma_j$, where $\sigma_j$ is the standard deviation of the $j$-th component of set $(X)_j$. After normalization, the mean of the $j$-th component of the result set $(Y)_j$ is zero and its variance is one.
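The two modes can be illustrated with a minimal pure-Python sketch. It is not the oneDAL API: the function name `zscore` and its `do_scale` argument are hypothetical stand-ins for the doScale flag, and the sketch assumes the sample standard deviation (divisor n - 1), which the text above does not specify.

```python
import math


def zscore(rows, do_scale=True):
    """Center each column of `rows`; if do_scale, also divide by its std dev.

    Hypothetical helper for illustration only (not part of oneDAL).
    Assumes the sample standard deviation (divisor n - 1).
    """
    n = len(rows)
    p = len(rows[0])
    # Column means m_j.
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    if do_scale:
        # Column standard deviations (sample estimate, an assumption).
        divisors = [
            math.sqrt(sum((row[j] - means[j]) ** 2 for row in rows) / (n - 1))
            for j in range(p)
        ]
    else:
        # Centering only: the divisor is 1 for every column.
        divisors = [1.0] * p
    return [[(row[j] - means[j]) / divisors[j] for j in range(p)] for row in rows]


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
centered = zscore(data, do_scale=False)  # column means become zero
scaled = zscore(data, do_scale=True)     # column variances also become one
```

The sketch does not guard against a constant column, whose standard deviation is zero and would cause a division by zero when scaling is enabled.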
Some algorithms require normalization parameters (mean and variance) as an input.
The implementation of the Z-score algorithm in oneDAL does not return these values by default.
To have them returned, set the resultsToCompute flag.
For details, see Algorithm Parameters.
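One common reason the parameters are needed: new observations usually have to be normalized with the same means and variances as the original data set. The following is a minimal sketch of that idea in pure Python. The helper names `fit_zscore` and `apply_zscore` are hypothetical (neither is a oneDAL function), and the sample variance with divisor n - 1 is an assumption.

```python
import math


def fit_zscore(rows):
    """Return per-column (means, variances); sample variance, divisor n - 1."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    variances = [
        sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1) for j in range(p)
    ]
    return means, variances


def apply_zscore(rows, means, variances):
    """Normalize `rows` with previously computed parameters."""
    return [
        [(x - m) / math.sqrt(v) for x, m, v in zip(r, means, variances)]
        for r in rows
    ]


train = [[0.0, 2.0], [2.0, 4.0], [4.0, 6.0]]
means, variances = fit_zscore(train)
# A new observation is normalized with the parameters of the training set.
new_rows = apply_zscore([[6.0, 0.0]], means, variances)
```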
Batch Processing
Algorithm Input
The Z-score normalization algorithm accepts the input described below.
Pass the Input ID as a parameter to the methods that provide input for your algorithm.
For more details, see Algorithms.
Input ID | Input |
---|---|
data | Pointer to the numeric table of size $n \times p$. This table can be an object of any class derived from NumericTable. |
Algorithm Parameters
The Z-score normalization algorithm has the following parameters.
Some of them are required only for specific values of the computation method parameter method:
Parameter | method | Default Value | Description |
---|---|---|---|
algorithmFPType | defaultDense or sumDense | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
method | Not applicable | defaultDense | Available computation methods: defaultDense and sumDense. |
moments | defaultDense | SharedPtr<low_order_moments::Batch<algorithmFPType, low_order_moments::defaultDense> > | Pointer to the low order moments algorithm that computes means and standard deviations to be used for Z-score normalization with the defaultDense method. |
doScale | defaultDense or sumDense | true | If true, the algorithm applies both centering and scaling. Otherwise, the algorithm provides only centering. |
resultsToCompute | defaultDense or sumDense | Not applicable | Optional. Pointer to the data collection containing the key-value pairs that select the optional Z-score results (means and variances; see Algorithm Output). Provide one of these values to request a single characteristic or use bitwise OR to request a combination of them. |
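The bitwise OR convention for resultsToCompute can be illustrated with Python's `enum.IntFlag`. This is only a sketch: the flag names and the `normalize` function are hypothetical and merely mimic the behavior described in the table. They are not oneDAL identifiers, and the sample variance (divisor n - 1) is an assumption.

```python
import enum
import math


class ResultToCompute(enum.IntFlag):
    """Hypothetical flags; the real oneDAL identifiers may differ."""
    NONE = 0
    MEANS = 1
    VARIANCES = 2


def normalize(rows, do_scale=True, results_to_compute=ResultToCompute.NONE):
    """Return a dict with normalized data and only the requested extras."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    # Sample variance (divisor n - 1): an assumption of this sketch.
    variances = [
        sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1) for j in range(p)
    ]
    divisors = [math.sqrt(v) for v in variances] if do_scale else [1.0] * p
    result = {
        "normalizedData": [
            [(x - m) / d for x, m, d in zip(r, means, divisors)] for r in rows
        ],
        # Results that were not requested stay None.
        "means": None,
        "variances": None,
    }
    if results_to_compute & ResultToCompute.MEANS:
        result["means"] = means
    if results_to_compute & ResultToCompute.VARIANCES:
        result["variances"] = variances
    return result


data = [[1.0, 5.0], [3.0, 9.0]]
# Request a single characteristic.
only_means = normalize(data, results_to_compute=ResultToCompute.MEANS)
# Combine flags with bitwise OR to request both.
both = normalize(
    data, results_to_compute=ResultToCompute.MEANS | ResultToCompute.VARIANCES
)
```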
Algorithm Output
The Z-score normalization algorithm calculates the result described below.
Pass the Result ID as a parameter to the methods that access the results of your algorithm.
For more details, see Algorithms.
Result ID | Result |
---|---|
normalizedData | Pointer to the $n \times p$ numeric table that stores the result of normalization. By default, the result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |
means | Optional. Pointer to the $1 \times p$ numeric table that contains the mean value of each feature. If the function result is not requested through the resultsToCompute parameter, the numeric table contains a NULL pointer. |
variances | Optional. Pointer to the $1 \times p$ numeric table that contains the variance of each feature. If the function result is not requested through the resultsToCompute parameter, the numeric table contains a NULL pointer. |
By default, each numeric table specified by the collection elements is an object of the HomogenNumericTable class.
You can also define the result as an object of any class derived from NumericTable, except for PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
Examples
C++ (CPU): Batch Processing
Java*
Python*: Batch Processing