Numeric Tables

Numeric tables are a fundamental component of in-memory numeric data processing. Intel DAAL supports heterogeneous and homogeneous numeric tables for dense and sparse data, as follows:

  • Heterogeneous, Array Of Structures (AOS)
  • Heterogeneous, Structure Of Arrays (SOA)
  • Homogeneous, dense
  • Homogeneous matrix, dense
  • Homogeneous symmetric matrix, packed
  • Homogeneous triangular matrix, packed
  • Homogeneous, sparse CSR

Use homogeneous numeric tables, that is, objects of the HomogenNumericTable class, and matrices, that is, objects of the Matrix, PackedTriangularMatrix, and PackedSymmetricMatrix classes, when all features have the same basic data type. Feature values are laid out in memory as one contiguous block in row-major order, that is, Observation 1, Observation 2, and so on. In Intel DAAL, Matrix is the homogeneous numeric table most suitable for matrix algebra operations.
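
The row-major layout described above can be sketched in plain Python. This is an illustration only, not the Intel DAAL API: the names block, element, and observation are hypothetical, and a Python list stands in for the contiguous memory block.

```python
# Illustrative only: plain Python, not the Intel DAAL API.
n_rows, n_cols = 3, 2  # 3 observations, 2 features each

# One contiguous block: Observation 1, then Observation 2, then Observation 3.
block = [1.0, 2.0,
         3.0, 4.0,
         5.0, 6.0]


def element(block, n_cols, row, col):
    """Return feature `col` of observation `row` (both zero-based)."""
    return block[row * n_cols + col]


def observation(block, n_cols, row):
    """Return all features of one observation as one contiguous slice."""
    start = row * n_cols
    return block[start:start + n_cols]
```

Because every value has the same type and width, the position of any element follows from the row and column alone, which is what makes this layout convenient for matrix algebra.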

For triangular and symmetric matrices with reduced memory footprint, special classes are available: PackedTriangularMatrix and PackedSymmetricMatrix. Use the DataLayout enumeration to choose between representations of triangular and symmetric matrices:

  • Lower packed: lowerPackedSymmetricMatrix or lowerPackedTriangularMatrix
  • Upper packed: upperPackedSymmetricMatrix or upperPackedTriangularMatrix


Figure: Packed storage format
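
To show where the reduced memory footprint comes from, here is a minimal sketch of lower packed storage in plain Python. It assumes the lower triangle is packed row by row; that ordering is an assumption made for illustration, and the helper names are hypothetical rather than part of Intel DAAL.

```python
# Illustrative only: plain Python, not the Intel DAAL API.
# Assumption: the lower triangle is stored row by row.

def pack_lower(matrix):
    """Keep only the elements on or below the main diagonal."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1)]


def packed_index(i, j):
    """Position of element (i, j), with j <= i, in the packed array."""
    return i * (i + 1) // 2 + j


def symmetric_get(packed, i, j):
    """Read element (i, j) of a symmetric matrix from its lower packed form."""
    if j > i:
        i, j = j, i  # symmetry: M[i][j] == M[j][i]
    return packed[packed_index(i, j)]


sym = [[1.0, 2.0, 4.0],
       [2.0, 3.0, 5.0],
       [4.0, 5.0, 6.0]]
packed = pack_lower(sym)
```

An n-by-n matrix needs n(n + 1)/2 stored values in this form instead of n squared, so the 3-by-3 example keeps 6 values rather than 9.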

Heterogeneous numeric tables enable you to work with data whose features are of different data types by nature. Intel DAAL provides two ways to represent non-homogeneous numeric tables: AOS and SOA.

AOS Numeric Table provides access to observations (feature vectors) that are laid out in a contiguous memory block:
Figure: Intel® DAAL AOS layout
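
The AOS idea can be sketched with the Python standard library: each observation is one fixed-size record, and records sit back to back in a single buffer. The record format (one 32-bit integer feature and one 64-bit floating-point feature) and all names here are hypothetical; this is not the Intel DAAL API.

```python
# Illustrative only: plain Python, not the Intel DAAL API.
import struct

# One record = one observation: an int32 feature followed by a float64 feature.
# "<" requests little-endian byte order with no padding between fields.
RECORD = struct.Struct("<id")

observations = [(25, 1.80), (31, 1.65), (47, 1.72)]

# Pack all observations into one contiguous buffer, record after record.
buffer = bytearray(RECORD.size * len(observations))
for k, obs in enumerate(observations):
    RECORD.pack_into(buffer, k * RECORD.size, *obs)


def read_observation(buffer, k):
    """Return every feature of observation k from its single record."""
    return RECORD.unpack_from(buffer, k * RECORD.size)
```

Reading a whole observation touches one contiguous record, while reading a single feature across all observations requires striding through the buffer.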

Examples

C++: datastructures_aos.cpp

Java*: DataStructuresAOS.java

Python*: datastructures_aos.py

SOA Numeric Table provides access to data sets where observations for each feature are laid out contiguously in memory:
Figure: Intel® DAAL SOA layout
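
For comparison, a minimal SOA sketch keeps one typed, contiguous array per feature. The feature names and helper function are hypothetical, and this is plain Python rather than the Intel DAAL API.

```python
# Illustrative only: plain Python, not the Intel DAAL API.
from array import array

# One contiguous, typed array per feature; index k refers to observation k.
features = {
    "age": array("i", [25, 31, 47]),           # integer feature
    "height": array("d", [1.80, 1.65, 1.72]),  # double-precision feature
}


def gather_observation(features, k):
    """Assemble observation k by reading position k of every feature array."""
    return tuple(column[k] for column in features.values())
```

Here a single feature is available as one contiguous array, while assembling a full observation gathers one value from each array.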

Examples

C++: datastructures_soa.cpp

Java*: DataStructuresSOA.java

Python*: datastructures_soa.py

The optimal data layout for homogeneous and heterogeneous numeric tables depends heavily on the particular algorithm. You can find algorithm-specific guidance in the Performance Considerations section for the appropriate algorithm.

Intel DAAL offers the CSRNumericTable class for a special version of a homogeneous numeric table that encodes sparse data, that is, the data with a significant number of zero elements. The library uses the Condensed Sparse Row (CSR) format for encoding:
Figure: Zero-based CSR

Figure: One-based CSR

Three arrays describe the sparse matrix M as follows:

  • The array values contains non-zero elements of the matrix row-by-row.
  • The j-th element of the array columns encodes the column index in the matrix M for the j-th element of the array values.
  • The i-th element of the array rowIndex encodes the index in the array values corresponding to the first non-zero element in rows indexed i or greater. The last element in the array rowIndex encodes the number of non-zero elements in the matrix M.

The library supports 1-based CSR encoding only. In C++, you can specify it by providing the oneBased value through the indexing parameter of type CSRIndexing in the constructor of CSRNumericTable.
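
The three arrays can be built from a small dense matrix as follows. This is a plain Python sketch with hypothetical helper names, not the CSRNumericTable API. It follows the usual one-based convention, in which every stored index is shifted by one, so the final rowIndex entry is the number of non-zero elements plus one (the zero-based form stores the count itself).

```python
# Illustrative only: plain Python, not the Intel DAAL API.

def to_csr_one_based(dense):
    """Encode a dense matrix (list of rows) as one-based CSR arrays."""
    values, columns, row_index = [], [], []
    for row in dense:
        # One-based position in `values` of this row's first non-zero element.
        row_index.append(len(values) + 1)
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                columns.append(j + 1)  # one-based column index
    row_index.append(len(values) + 1)  # one past the last stored element
    return values, columns, row_index


def row_entries(values, columns, row_index, i):
    """Return (column, value) pairs of zero-based row i from the CSR arrays."""
    start, stop = row_index[i] - 1, row_index[i + 1] - 1
    return list(zip(columns[start:stop], values[start:stop]))


dense = [[1.0, 0.0, 2.0],
         [0.0, 0.0, 0.0],
         [0.0, 3.0, 0.0],
         [4.0, 0.0, 5.0]]
values, columns, row_index = to_csr_one_based(dense)
# values    -> [1.0, 2.0, 3.0, 4.0, 5.0]
# columns   -> [1, 3, 2, 1, 3]
# row_index -> [1, 3, 3, 4, 6]
```

The second row is empty, so its rowIndex entry repeats the entry of the next row: it points at the first non-zero element found in that row or any later one.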

Examples

C++: datastructures_csr.cpp

Java*: DataStructuresCSR.java

Python*: datastructures_csr.py

Intel DAAL offers the MergedNumericTable class for tables that provide access to data sets comprising several logical components, such as a set of feature vectors and corresponding labels. This type of table enables you to read those data components from one data source. A merged numeric table can hold several numeric tables of any type except CSRNumericTable. In a merged numeric table, arrays are joined by columns and therefore can have different numbers of columns. If the input matrices have different numbers of rows, the number of rows in the merged table equals min(r1, r2, ..., rm), where ri is the number of rows in the i-th matrix, i = 1, 2, 3, ..., m.


Figure: Merged Numeric Table
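
The column-wise join and the min(r1, r2, ..., rm) row rule can be sketched in plain Python. The function name is hypothetical, and the sketch copies values into a new list for simplicity; it makes no claim about how MergedNumericTable stores or accesses its components.

```python
# Illustrative only: plain Python, not the Intel DAAL API.

def merge_by_columns(*tables):
    """Join tables (lists of rows) side by side, truncating to the shortest."""
    n_rows = min(len(table) for table in tables)
    merged = []
    for i in range(n_rows):
        row = []
        for table in tables:
            row.extend(table[i])  # append this table's columns for row i
        merged.append(row)
    return merged


feature_vectors = [[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]]   # 3 rows, 2 columns
labels = [[0],
          [1]]                   # 2 rows, 1 column

merged = merge_by_columns(feature_vectors, labels)
# merged -> [[1.0, 2.0, 0], [3.0, 4.0, 1]]
```

The inputs have 3 and 2 rows, so the merged result has min(3, 2) = 2 rows and 2 + 1 = 3 columns.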

Examples

C++: datastructures_merged.cpp

Java*: DataStructuresMerged.java

Python*: datastructures_merged.py
