Developer Reference

  • 2020.2
  • 07/15/2020

Summary Statistics

The Summary Statistics domain provides routines that compute basic statistical estimates for single and double precision multi-dimensional datasets.
The Summary Statistics routines calculate:
  • raw and central moments up to the fourth order
  • skewness and excess kurtosis (further referred to as kurtosis for brevity)
  • variation coefficient
  • quantiles and order statistics
  • minimum and maximum
  • variance-covariance/correlation matrix
  • pooled/group variance-covariance matrix and mean
  • partial variance-covariance/correlation matrix
  • robust estimators for variance-covariance matrix and mean in presence of outliers
  • raw/central partial sums up to the fourth order (for brevity referred to as raw/central sums)
  • matrix of cross-products and sums of squares (for brevity referred to as cross-product matrix)
  • median absolute deviation, mean absolute deviation
The library also contains functions to perform the following tasks:
  • Detect outliers in datasets
  • Support missing values in datasets
  • Parameterize correlation matrices
  • Compute quantiles for streaming data
Mathematical Notation and Definitions defines the supported operations in the Summary Statistics routines.
You can access the Summary Statistics routines through the Fortran 90 and C89 language interfaces.
You can use the Fortran 90 interface with programs written in Fortran 95.
The mkl_vsl.f90 header file is in the ${MKL}/include directory.
For more details about the Fortran header, see the Random Number Generators topic.
You can find examples that demonstrate calculation of the Summary Statistics estimates in the ${MKL}/examples/vslf example directory.
The Summary Statistics API is implemented through task objects, or tasks. A task object is a data structure, or a descriptor, holding parameters that determine a specific Summary Statistics operation. For example, such parameters may be precision, dimensions of user data, the matrix of the observations, or shapes of data arrays.
A typical Summary Statistics computation processes a task object in four steps:
  1. Create a task.
  2. Modify settings of the task parameters.
  3. Compute statistical estimates.
  4. Destroy the task.
The Summary Statistics functions fall into the following categories:
  • Task Constructors: routines that create a new task object descriptor and set up the most common parameters (dimension, number of observations, and matrix of the observations).
  • Task Editors: routines that set or modify parameter settings in an existing task descriptor.
  • Task Computation Routine: a routine that computes the specified statistical estimates.
  • Task Destructor: a routine that deletes the task object and frees its memory.
A Summary Statistics task object contains a series of pointers to the input and output data arrays. Because the task stores only pointers, you can read and modify the datasets and estimates at any time, but you are responsible for allocating and releasing the memory for that data.
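The create, edit, compute, destroy sequence and the memory ownership rule described above can be sketched with a toy task object in plain C. Everything in this block (toy_task, toy_new_task, toy_edit_task, toy_compute, toy_delete_task, toy_demo, and the TOY_ flags) is a hypothetical stand-in written for illustration. It is not the library's API; the library's own routines have different names and signatures, which are documented in the individual routine topics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical estimate selectors (bit flags); not Intel MKL constants. */
#define TOY_MEAN 1u
#define TOY_MIN  2u
#define TOY_MAX  4u

/* Hypothetical task descriptor: it stores only pointers to caller-owned arrays. */
typedef struct {
    size_t n;         /* number of observations */
    const double *x;  /* input observations, owned by the caller */
    double *mean;     /* output destinations, owned by the caller; NULL if unset */
    double *min;
    double *max;
} toy_task;

/* Task constructor: allocate the descriptor and record the common parameters. */
toy_task *toy_new_task(size_t n, const double *x)
{
    toy_task *t;
    if (x == NULL || n == 0) return NULL;
    t = (toy_task *)malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->n = n;
    t->x = x;
    t->mean = t->min = t->max = NULL;
    return t;
}

/* Task editor: register where one estimate should be written. */
int toy_edit_task(toy_task *t, unsigned what, double *dest)
{
    if (t == NULL || dest == NULL) return -1;
    if (what == TOY_MEAN) t->mean = dest;
    else if (what == TOY_MIN) t->min = dest;
    else if (what == TOY_MAX) t->max = dest;
    else return -1;
    return 0;
}

/* Computation routine: compute the requested estimates into their destinations. */
int toy_compute(const toy_task *t, unsigned estimates)
{
    size_t i;
    double sum, lo, hi;
    if (t == NULL) return -1;
    /* Refuse any estimate whose destination was never registered. */
    if ((estimates & TOY_MEAN) && t->mean == NULL) return -1;
    if ((estimates & TOY_MIN) && t->min == NULL) return -1;
    if ((estimates & TOY_MAX) && t->max == NULL) return -1;
    sum = 0.0;
    lo = hi = t->x[0];
    for (i = 0; i < t->n; ++i) {
        sum += t->x[i];
        if (t->x[i] < lo) lo = t->x[i];
        if (t->x[i] > hi) hi = t->x[i];
    }
    if (estimates & TOY_MEAN) *t->mean = sum / (double)t->n;
    if (estimates & TOY_MIN) *t->min = lo;
    if (estimates & TOY_MAX) *t->max = hi;
    return 0;
}

/* Task destructor: free the descriptor only; the caller's arrays are untouched. */
void toy_delete_task(toy_task **t)
{
    if (t != NULL && *t != NULL) {
        free(*t);
        *t = NULL;
    }
}

/* Usage: the four steps in order. Returns 0 on success. */
int toy_demo(void)
{
    double x[4] = {1.0, 2.0, 3.0, 6.0};  /* caller-owned input */
    double mean = 0.0, hi = 0.0;         /* caller-owned outputs */
    toy_task *t = toy_new_task(4, x);                  /* 1. create  */
    if (t == NULL) return -1;
    if (toy_edit_task(t, TOY_MEAN, &mean) != 0) return -1;  /* 2. edit */
    if (toy_edit_task(t, TOY_MAX, &hi) != 0) return -1;
    if (toy_compute(t, TOY_MEAN | TOY_MAX) != 0) return -1; /* 3. compute */
    toy_delete_task(&t);                               /* 4. destroy */
    assert(t == NULL);
    /* The results remain valid because the caller owns their storage. */
    return (mean == 3.0 && hi == 6.0) ? 0 : -1;
}
```

Because the descriptor holds pointers rather than copies, changing the input array and calling the computation routine again picks up the new values, and destroying the task leaves both the data and the computed estimates intact.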
See detailed information on the algorithms, API, and their usage in the Intel® MKL Summary Statistics Application Notes [SS Notes].
