Summary Statistics
The Summary Statistics domain provides routines that
compute basic statistical estimates for single and double precision
multi-dimensional datasets.
The Summary Statistics routines calculate:
- raw and central moments up to the fourth order
- skewness and excess kurtosis (further referred to as kurtosis for brevity)
- variation coefficient
- quantiles and order statistics
- minimum and maximum
- variance-covariance/correlation matrix
- pooled/group variance-covariance matrix and mean
- partial variance-covariance/correlation matrix
- robust estimators for variance-covariance matrix and mean in presence of outliers
- raw/central partial sums up to the fourth order (for brevity referred to as raw/central sums)
- matrix of cross-products and sums of squares (for brevity referred to as cross-product matrix)
- median absolute deviation, mean absolute deviation
The library also contains functions to perform the
following tasks:
- Detect outliers in datasets
- Support missing values in datasets
- Parameterize correlation matrices
- Compute quantiles for streaming data
The Mathematical Notation and Definitions section defines the operations
that the Summary Statistics routines support.
You can access the Summary Statistics routines
through the Fortran 90 and C89 language interfaces.
You can use the
C89 interface with later versions of C/C++.
The mkl_vsl.h header file is in the
${MKL}/include
directory.
You can find examples that demonstrate calculation of
the Summary Statistics estimates in the
${MKL}/examples/vslc
example directory.
The Summary Statistics API is implemented through
task objects, or tasks. A task object is a data structure, or a descriptor,
holding parameters that determine a specific Summary Statistics operation. For
example, such parameters may be precision, dimensions of user data, the matrix
of the observations, or shapes of data arrays.
All the Summary Statistics routines process a task
object as follows:
- Create a task.
- Modify settings of the task parameters.
- Compute statistical estimates.
- Destroy the task.
The Summary Statistics functions fall into the
following categories:
- Task Constructors: routines that create a new task object descriptor and set up the most common parameters (dimension, number of observations, and matrix of the observations).
- Task Editors: routines that can set or modify some parameter settings in an existing task descriptor.
- Task Computation Routine: a routine that computes the specified statistical estimates.
- Task Destructor: a routine that deletes the task object and frees its memory.
A Summary Statistics task object contains a series
of pointers to the input and output data arrays. You can read and modify the
datasets and estimates at any time, but you are responsible for allocating and
releasing the memory for such data.
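The create, edit, compute, destroy sequence and the caller-owned buffers described above can be sketched in plain C as follows. This is an illustration of the descriptor pattern only, not the library's API: demo_task and every demo_ function are hypothetical names, the real task object is opaque, and the row-wise layout of the observations is an assumption made for this sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor: it stores pointers to caller-owned arrays
   and never copies or frees the data they point to. */
typedef struct {
    size_t p;          /* dimension (number of variables) */
    size_t n;          /* number of observations */
    const double *x;   /* observations, p rows of n values (assumed layout) */
    double *mean;      /* caller-owned output of length p, NULL until set */
} demo_task;

/* Constructor: sets the most common parameters. */
demo_task *demo_new_task(size_t p, size_t n, const double *x)
{
    demo_task *t;
    if (p == 0 || n == 0 || x == NULL) return NULL;
    t = (demo_task *)malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->p = p;
    t->n = n;
    t->x = x;
    t->mean = NULL;
    return t;
}

/* Editor: registers the buffer that will receive the means. */
int demo_edit_mean(demo_task *t, double *mean)
{
    if (t == NULL || mean == NULL) return -1;
    t->mean = mean;
    return 0;
}

/* Computation: writes one mean per variable into the registered buffer. */
int demo_compute_mean(demo_task *t)
{
    size_t i, j;
    if (t == NULL || t->mean == NULL) return -1;
    for (i = 0; i < t->p; i++) {
        double sum = 0.0;
        for (j = 0; j < t->n; j++) sum += t->x[i * t->n + j];
        t->mean[i] = sum / (double)t->n;
    }
    return 0;
}

/* Destructor: frees the descriptor only; the data stays with the caller. */
void demo_delete_task(demo_task **t)
{
    if (t != NULL && *t != NULL) {
        free(*t);
        *t = NULL;
    }
}

/* Walks through the four stages for 2 variables and 3 observations. */
int demo_example(double mean_out[2])
{
    static const double x[6] = {1.0, 2.0, 3.0, 10.0, 20.0, 30.0};
    int status;
    demo_task *t = demo_new_task(2, 3, x);              /* create  */
    if (t == NULL) return -1;
    status = demo_edit_mean(t, mean_out);               /* edit    */
    if (status == 0) status = demo_compute_mean(t);     /* compute */
    demo_delete_task(&t);                               /* destroy */
    return status;
}
```

Because the descriptor holds only pointers, the results remain valid in the caller's buffer after the task is destroyed, and the caller decides when to release the input and output arrays.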
For detailed information on the algorithms, the API, and
their usage, see the
Summary Statistics Application Notes [SS Notes].
Intel® oneAPI Math Kernel Library