Developer Guide

Numeric Tables

Numeric tables are a fundamental component of in-memory numeric data processing. Intel DAAL supports heterogeneous and homogeneous numeric tables for dense and sparse data, as follows:
  • Heterogeneous, Array Of Structures (AOS)
  • Heterogeneous, Structure Of Arrays (SOA)
  • Heterogeneous, Apache Arrow immutable table
  • Homogeneous, dense
  • Homogeneous matrix, dense
  • Homogeneous symmetric matrix, packed
  • Homogeneous triangular matrix, packed
  • Homogeneous, sparse CSR
Use homogeneous numeric tables, that is, objects of the HomogenNumericTable class, and matrices, that is, objects of the Matrix, PackedTriangularMatrix, and PackedSymmetricMatrix classes, when all the features are of the same basic data type. Values of the features are laid out in memory as one contiguous block in row-major order, that is, Observation 1, Observation 2, and so on. In Intel DAAL, Matrix is a homogeneous numeric table most suitable for matrix algebra operations.
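The row-major layout described above can be sketched in a few lines of standalone C++. This is a minimal illustration and does not use the Intel DAAL API: `DenseTable`, `at`, and `makeExampleTable` are hypothetical names introduced here, and the choice of `float` as the element type is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense homogeneous table; not a DAAL class.
struct DenseTable {
    std::size_t nRows;        // number of observations
    std::size_t nCols;        // number of features
    std::vector<float> data;  // nRows * nCols values in one contiguous block

    DenseTable(std::size_t rows, std::size_t cols)
        : nRows(rows), nCols(cols), data(rows * cols, 0.0f) {}

    // Value of feature `col` for observation `row` (row-major addressing).
    float& at(std::size_t row, std::size_t col) {
        assert(row < nRows && col < nCols);
        return data[row * nCols + col];
    }
};

// Builds a 3 x 2 table whose values encode their own position as 10 * row + col.
inline DenseTable makeExampleTable() {
    DenseTable t(3, 2);
    for (std::size_t r = 0; r < t.nRows; ++r)
        for (std::size_t c = 0; c < t.nCols; ++c)
            t.at(r, c) = static_cast<float>(10 * r + c);
    return t;
}
```

In the resulting block, all features of Observation 1 come first, followed by all features of Observation 2, and so on.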
For triangular and symmetric matrices with reduced memory footprint, special classes are available: PackedTriangularMatrix and PackedSymmetricMatrix. Use the DataLayout enumeration to choose between representations of triangular and symmetric matrices:
  • Lower packed: lowerPackedSymmetricMatrix or lowerPackedTriangularMatrix
  • Upper packed: upperPackedTriangularMatrix or upperPackedSymmetricMatrix
Figure: Packed storage format
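To make the memory saving concrete, the following sketch maps a (row, column) pair to an offset in a packed array. It assumes that the stored triangle is packed row by row with zero-based indices; that ordering is an assumption made for illustration, so check it against the packed storage format used by the library. The function names are hypothetical and are not part of Intel DAAL.

```cpp
#include <cassert>
#include <cstddef>

// Number of stored elements for an n x n packed matrix: n(n+1)/2 instead of n*n.
inline std::size_t packedSize(std::size_t n) { return n * (n + 1) / 2; }

// Lower packed, assumed row-by-row: row i stores columns 0..i.
inline std::size_t lowerPackedIndex(std::size_t i, std::size_t j) {
    assert(j <= i);
    return i * (i + 1) / 2 + j;
}

// Upper packed, assumed row-by-row: row i stores columns i..n-1.
inline std::size_t upperPackedIndex(std::size_t n, std::size_t i, std::size_t j) {
    assert(i <= j && j < n);
    // Rows 0..i-1 hold n, n-1, ..., n-i+1 elements: i * (2n - i + 1) / 2 in total.
    return i * (2 * n - i + 1) / 2 + (j - i);
}

// A symmetric matrix stored as lower packed: element (i, j) equals element (j, i).
inline std::size_t lowerSymmetricIndex(std::size_t i, std::size_t j) {
    return i >= j ? lowerPackedIndex(i, j) : lowerPackedIndex(j, i);
}
```

For a 3 x 3 matrix this stores 6 values rather than 9, and the saving approaches one half as the matrix grows.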
Heterogeneous numeric tables enable you to deal with data structures that are of different data types by nature. Intel DAAL provides two ways to represent non-homogeneous numeric tables: AOS and SOA.
AOS Numeric Table provides access to observations (feature vectors) that are laid out in a contiguous memory block:
Figure: Intel(R) DAAL AOS layout
SOA Numeric Table provides access to data sets where observations for each feature are laid out contiguously in memory:
Figure: Intel(R) DAAL SOA layout
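The difference between the two heterogeneous layouts can be sketched in plain C++ without the library. The feature set (`height`, `age`, `income`) and all type and function names below are hypothetical and chosen only for illustration; the sketch shows the memory arrangement, not the AOS or SOA numeric table classes themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// AOS: one struct per observation, so the features of an observation are adjacent.
struct Observation {  // hypothetical feature set with mixed types
    float  height;
    int    age;
    double income;
};
using AosTable = std::vector<Observation>;

// SOA: one contiguous array per feature, so the values of a feature are adjacent.
struct SoaTable {
    std::vector<float>  height;
    std::vector<int>    age;
    std::vector<double> income;
};

// Rearranges AOS data into SOA form by copying each feature into its own array.
inline SoaTable toSoa(const AosTable& aos) {
    SoaTable soa;
    soa.height.reserve(aos.size());
    soa.age.reserve(aos.size());
    soa.income.reserve(aos.size());
    for (const Observation& o : aos) {
        soa.height.push_back(o.height);
        soa.age.push_back(o.age);
        soa.income.push_back(o.income);
    }
    return soa;
}

// A per-feature computation scans a single contiguous array in the SOA layout.
inline double meanAge(const SoaTable& t) {
    assert(!t.age.empty());
    long long sum = 0;
    for (int a : t.age) sum += a;
    return static_cast<double>(sum) / static_cast<double>(t.age.size());
}
```

Row-oriented access (one whole observation at a time) tends to favor the AOS arrangement, while column-oriented access (one feature across all observations) tends to favor SOA.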
Apache Arrow immutable numeric table provides access to immutable data sets stored in the Apache Arrow format. The memory layout is similar to that of the SOA numeric table, but every feature can be stored in more than one contiguous piece of memory. See https://arrow.apache.org/docs/format/Layout.html for details.
For now, Intel DAAL supports only immutable tables, so this kind of numeric table can be passed as the input for a particular algorithm, but cannot be passed for the output. Also, this kind of numeric table is available for C++ only.
Samples
C++: datastructures_arrow.cpp
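The idea that a single feature can span several contiguous pieces of memory can be sketched as follows. This is a simplified model written for illustration: `ChunkedColumn` is a hypothetical type, it does not use the Apache Arrow or Intel DAAL APIs, and it exposes read-only access only, mirroring the immutability described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one feature stored as several contiguous chunks.
struct ChunkedColumn {
    std::vector<std::vector<double>> chunks;

    // Total number of observations across all chunks.
    std::size_t size() const {
        std::size_t n = 0;
        for (const auto& c : chunks) n += c.size();
        return n;
    }

    // Read-only access by global row index: walk the chunks until the row is found.
    double at(std::size_t row) const {
        for (const auto& c : chunks) {
            if (row < c.size()) return c[row];
            row -= c.size();  // skip this chunk and continue with a local index
        }
        assert(false && "row index out of range");
        return 0.0;
    }
};
```

A table in this model would hold one such column per feature, which is why the layout resembles SOA with an extra level of indirection.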
The optimal data layout for homogeneous and heterogeneous numeric tables depends heavily on the particular algorithm. You can find algorithm-specific guidance in the Performance Considerations section for the appropriate algorithm.
Intel DAAL offers the CSRNumericTable class for a special version of a homogeneous numeric table that encodes sparse data, that is, data with a significant number of zero elements. The library uses the Condensed Sparse Row (CSR) format for encoding:
Figure: Zero-based CSR
Figure: One-based CSR
Three arrays describe the sparse matrix M as follows:
  • The array values contains non-zero elements of the matrix row-by-row.
  • The j-th element of the array columns encodes the column index in the matrix M for the j-th element of the array values.
  • The i-th element of the array rowIndex encodes the index in the array values corresponding to the first non-zero element in rows indexed i or greater. The last element in the array rowIndex encodes the number of non-zero elements in the matrix M (with one-based indexing, that number plus one).
The library supports one-based CSR encoding only. In C++, you can specify it by providing the oneBased value through the indexing parameter of type CSRIndexing in the constructor of CSRNumericTable.
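As a worked illustration of the three arrays, the sketch below converts a small dense row-major matrix to CSR. It is standalone C++ that does not call the CSRNumericTable API: the `Csr` struct and the `denseToCsr` function are hypothetical helpers, and the `base` parameter (0 or 1) is introduced here only to show how zero-based and one-based encodings differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the three CSR arrays; not the DAAL CSRNumericTable class.
struct Csr {
    std::vector<double>      values;    // non-zero elements, row by row
    std::vector<std::size_t> columns;   // column index of each stored value
    std::vector<std::size_t> rowIndex;  // position in `values` where each row starts
};

// Converts a dense row-major matrix to CSR. `base` is 0 (zero-based) or 1 (one-based).
inline Csr denseToCsr(const std::vector<double>& dense,
                      std::size_t nRows, std::size_t nCols, std::size_t base) {
    assert(dense.size() == nRows * nCols);
    assert(base == 0 || base == 1);
    Csr csr;
    for (std::size_t r = 0; r < nRows; ++r) {
        // Row r starts at the current end of `values`, shifted by the index base.
        csr.rowIndex.push_back(csr.values.size() + base);
        for (std::size_t c = 0; c < nCols; ++c) {
            const double v = dense[r * nCols + c];
            if (v != 0.0) {
                csr.values.push_back(v);
                csr.columns.push_back(c + base);
            }
        }
    }
    // Closing entry: number of non-zero elements, shifted by the index base.
    csr.rowIndex.push_back(csr.values.size() + base);
    return csr;
}
```

For the 3 x 3 matrix with rows (1, 0, 2), (0, 0, 3), and (4, 5, 0), one-based encoding gives values = {1, 2, 3, 4, 5}, columns = {1, 3, 3, 1, 2}, and rowIndex = {1, 3, 4, 6}. A row with no non-zero elements repeats the start index of the following row.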
Intel DAAL offers the MergedNumericTable class for tables that provide access to data sets comprising several logical components, such as a set of feature vectors and corresponding labels. This type of table enables you to read those data components from one data source. A merged numeric table can hold several numeric tables of any type except CSRNumericTable. In a merged numeric table, arrays are joined by columns and therefore can have different numbers of columns. If the input matrices have different numbers of rows, the number of rows in the merged table equals $\min(r_1, r_2, \ldots, r_m)$, where $r_i$ is the number of rows in the $i$-th matrix, $i = 1, 2, 3, \ldots, m$.
Figure: Merged Numeric Table
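The column-wise join and the minimum-row rule can be illustrated with a small standalone sketch. It copies the data into a new row-major block for simplicity and makes no claim about how MergedNumericTable stores or references its parts; `RowMajorTable` and `mergeByColumns` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense table in row-major order; not a DAAL class.
struct RowMajorTable {
    std::size_t nRows;
    std::size_t nCols;
    std::vector<double> data;
};

// Joins tables by columns. The result has min(r_1, ..., r_m) rows and the
// sum of the column counts of all parts.
inline RowMajorTable mergeByColumns(const std::vector<RowMajorTable>& parts) {
    assert(!parts.empty());
    std::size_t rows = parts.front().nRows;
    std::size_t cols = 0;
    for (const RowMajorTable& p : parts) {
        rows = std::min(rows, p.nRows);
        cols += p.nCols;
    }
    RowMajorTable merged{rows, cols, std::vector<double>(rows * cols)};
    std::size_t offset = 0;  // first column of the current part in the merged table
    for (const RowMajorTable& p : parts) {
        for (std::size_t r = 0; r < rows; ++r)
            for (std::size_t c = 0; c < p.nCols; ++c)
                merged.data[r * cols + offset + c] = p.data[r * p.nCols + c];
        offset += p.nCols;
    }
    return merged;
}
```

Merging a 3 x 2 table of feature vectors with a 2 x 1 table of labels yields a 2 x 3 table: the third feature vector is dropped because it has no corresponding label.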

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804