Distributed Processing
Step 1 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex 
Not applicable

Unique identifier of block initially passed for computation on the local node.

nBlocks 
Not applicable

Number of blocks initially passed for computation on all nodes.

Input ID

Input


step1Data
Pointer to the n x p numeric table with the observations to be clustered. The input can be an object of any class derived from NumericTable.

Partial Result ID

Result


partialOrder
Pointer to the n x 2 numeric table containing information about observations: identifier of the initial block and index in the initial block. This information is required to reconstruct the initial blocks after observations are transferred among nodes.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
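The layout of partialOrder can be pictured with a minimal sketch. Plain Python lists stand in for numeric tables, and the helper name make_partial_order is hypothetical, not part of the library API; the column order (block identifier first, then index) follows the description above.

```python
def make_partial_order(block_index, n_rows):
    """Build an n x 2 table: one row per observation holding
    (identifier of the initial block, index inside that block)."""
    return [[block_index, i] for i in range(n_rows)]


# A node that initially holds block 2 with three observations.
order = make_partial_order(block_index=2, n_rows=3)
```

Because every observation carries this pair with it, the original blocks can be rebuilt after observations have moved between nodes.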
Step 2 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex 
Not applicable

Unique identifier of block initially passed for computation on the local node.

nBlocks 
Not applicable

Number of blocks initially passed for computation on all nodes.

Input ID

Input


partialData
Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered. The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable.

Partial Result ID

Result


boundingBox
Pointer to the 2 x p numeric table containing the bounding box of the input observations: the first row contains the minimum value of each feature, the second row contains the maximum value of each feature.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
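The boundingBox result described above can be sketched in a few lines. This is an illustration in plain Python, with nested lists standing in for the collection of numeric tables; the function name bounding_box is hypothetical and not a library call.

```python
def bounding_box(tables):
    """Return a 2 x p table: row 0 holds per-feature minima,
    row 1 holds per-feature maxima over all rows of all tables."""
    rows = [row for table in tables for row in table]
    if not rows:
        raise ValueError("at least one observation is required")
    columns = list(zip(*rows))  # transpose to iterate feature by feature
    return [[min(col) for col in columns], [max(col) for col in columns]]


# Two tables with p = 2 features each.
box = bounding_box([[[1.0, 5.0]], [[3.0, 2.0], [0.0, 4.0]]])
```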
Step 3 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

leftBlocks 
Not applicable

Number of blocks that will process observations with value of selected split feature smaller than selected split value.

rightBlocks 
Not applicable

Number of blocks that will process observations with value of selected split feature greater than selected split value.

Input ID

Input


partialData
Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered. The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable.

step3PartialBoundingBoxes
Pointer to the collection of 2 x p numeric tables containing bounding boxes computed on step 2 and collected from all nodes participating in the current iteration of the geometric repartitioning process. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID

Result


split
Pointer to the 1 x 2 numeric table containing information about the split for the current iteration of geometric repartitioning.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
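The text does not say how the split is chosen or how the two cells of the 1 x 2 table are laid out. The sketch below is therefore only one plausible reading: it merges the collected bounding boxes, picks the feature with the widest range, and splits at the midpoint. The rule, the (value, feature) layout, and the name choose_split are all assumptions made for illustration.

```python
def choose_split(bounding_boxes):
    """bounding_boxes: list of 2 x p tables ([mins, maxs]).
    Returns a 1 x 2 table [[split_value, split_feature]] (assumed layout)."""
    # Merge all boxes into one global box.
    mins = [min(col) for col in zip(*(box[0] for box in bounding_boxes))]
    maxs = [max(col) for col in zip(*(box[1] for box in bounding_boxes))]
    # Illustrative rule: split the widest feature at its midpoint.
    widths = [hi - lo for lo, hi in zip(mins, maxs)]
    feature = widths.index(max(widths))
    value = (mins[feature] + maxs[feature]) / 2.0
    return [[value, feature]]


split = choose_split([[[0.0, 0.0], [1.0, 4.0]], [[2.0, 1.0], [3.0, 9.0]]])
```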
Step 4 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

leftBlocks 
Not applicable

Number of blocks that will process observations with value of selected split feature smaller than selected split value.

rightBlocks 
Not applicable

Number of blocks that will process observations with value of selected split feature greater than selected split value.

Input ID

Input


partialData
Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable.

step4PartialOrders
Pointer to the collection of numeric tables with 2 columns and an arbitrary number of rows containing information about observations: identifier of the initial block and index in the initial block.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step4PartialSplits
Pointer to the collection of 1 x 2 numeric tables containing information about the split computed on step 3 and collected from all nodes participating in the current iteration of the geometric repartitioning process.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID

Result


partitionedData
Pointer to the collection of (leftBlocks + rightBlocks) numeric tables with p columns and an arbitrary number of rows containing observations for processing on the nodes participating in the current iteration of geometric repartitioning. The first leftBlocks numeric tables in the collection hold observations whose value of the selected split feature is smaller than the selected split value. The next rightBlocks numeric tables hold observations whose value of the selected split feature is larger than the selected split value.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
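A rough sketch of how partitionedData could be assembled from a split follows. Two details are not specified in the text and are assumptions here: observations equal to the split value go to the right side, and rows are spread round-robin over the blocks on each side. The function name partition is hypothetical.

```python
def partition(rows, split_feature, split_value, left_blocks, right_blocks):
    """Return left_blocks + right_blocks tables: left tables first,
    then right tables, matching the ordering described above."""
    left = [[] for _ in range(left_blocks)]
    right = [[] for _ in range(right_blocks)]
    n_left = n_right = 0
    for row in rows:
        if row[split_feature] < split_value:
            left[n_left % left_blocks].append(row)      # round-robin (assumed)
            n_left += 1
        else:                                           # ties go right (assumed)
            right[n_right % right_blocks].append(row)
            n_right += 1
    return left + right


parts = partition([[1.0], [5.0], [2.0], [7.0], [3.0]], 0, 4.0, 2, 1)
```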
Step 5 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex 
Not applicable

Unique identifier of block initially passed for computation on the local node.

nBlocks 
Not applicable

Number of blocks initially passed for computation on all nodes.

epsilon 
Not applicable

The maximum distance between observations lying in the same neighborhood.

Input ID

Input


partialData
Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable.

step5PartialBoundingBoxes
Pointer to the collection of 2 x p numeric tables containing bounding boxes computed on step 2 and collected from all nodes. The numeric tables in the collection should be ordered by the identifiers of the initial blocks of the nodes.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID

Result


partitionedHaloData
Pointer to the collection of nBlocks numeric tables with p columns and an arbitrary number of rows containing observations from the current node that should be used as halo observations on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes.
By default, this result is an object of the DataCollection class. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

partitionedHaloDataIndices
Pointer to the collection of nBlocks numeric tables with 1 column and an arbitrary number of rows containing indices of observations from the current node that should be used as halo observations on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes.
By default, this result is an object of the DataCollection class. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
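The text does not define exactly which observations count as halo observations. A common interpretation, used in the sketch below as an assumption, is that a local observation is a halo observation for another block when its Euclidean distance to that block's bounding box is at most epsilon. The helper names are hypothetical; the two returned collections mirror partitionedHaloData and partitionedHaloDataIndices.

```python
import math


def dist_to_box(point, box):
    """Euclidean distance from a point to an axis-aligned box [mins, maxs]."""
    mins, maxs = box
    squared = 0.0
    for x, lo, hi in zip(point, mins, maxs):
        gap = max(lo - x, 0.0, x - hi)  # zero when x lies inside [lo, hi]
        squared += gap * gap
    return math.sqrt(squared)


def halo_for_blocks(rows, boxes, block_index, epsilon):
    """Return (halo_data, halo_indices), one table per block,
    ordered by block identifier. The own block gets empty tables."""
    halo_data, halo_indices = [], []
    for b, box in enumerate(boxes):
        picked = [] if b == block_index else [
            i for i, row in enumerate(rows) if dist_to_box(row, box) <= epsilon
        ]
        halo_data.append([rows[i] for i in picked])
        halo_indices.append([[i] for i in picked])
    return halo_data, halo_indices
```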
Step 6 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex 
Not applicable

Unique identifier of block initially passed for computation on the local node.

nBlocks 
Not applicable

Number of blocks initially passed for computation on all nodes.

epsilon 
Not applicable

The maximum distance between observations lying in the same neighborhood.

minObservations 
Not applicable

The number of observations in a neighborhood for an observation to be considered as a core.

memorySavingMode 
false 
If the flag is set to false, all neighborhoods are computed and stored prior to clustering. This requires up to O(sum of sizes of neighborhoods) additional memory, which in the worst case can be O(number of observations^2), but performance may be better in the common case.

Input ID

Input


partialData
Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable.

haloData
Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing halo observations for the current node computed on step 5.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable.

haloDataIndices
Pointer to the collection of numeric tables with 1 column and an arbitrary number of rows, containing indices of the halo observations for the current node computed on step 5. The size of this collection should equal the size of the collection passed for the haloData input ID.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

haloDataBlocks
Pointer to the collection of 1 x 1 numeric tables containing identifiers of the initial blocks of the halo observations for the current node computed on step 5. The size of this collection should equal the size of the collection passed for the haloData input ID.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID

Result


step6ClusterStructure
Pointer to the numeric table with 4 columns and an arbitrary number of rows containing information about the current clustering state of observations processed on the local node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step6FinishedFlag
Pointer to the 1 x 1 numeric table containing the flag indicating that the clustering process is finished for the current node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step6NClusters
Pointer to the 1 x 1 numeric table containing the current number of clusters found on the local node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step6Queries
Pointer to the collection of nBlocks numeric tables with 3 columns and an arbitrary number of rows containing clustering queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes.
By default, this result is an object of the DataCollection class. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
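To make the roles of epsilon and minObservations concrete, here is a brute-force sketch of the core-observation test on one node, using local observations plus halo observations as neighbor candidates. It assumes that an observation counts as a member of its own neighborhood, which is the usual DBSCAN convention but is not stated in the text; core_flags is a hypothetical name, not a library function.

```python
def core_flags(rows, halo_rows, epsilon, min_observations):
    """For each local row, report whether its epsilon-neighborhood
    (searched among local and halo rows) has at least min_observations."""
    candidates = rows + halo_rows
    eps_sq = epsilon * epsilon  # compare squared distances, no sqrt needed
    flags = []
    for p in rows:
        count = sum(
            1 for q in candidates
            if sum((a - b) ** 2 for a, b in zip(p, q)) <= eps_sq
        )
        flags.append(count >= min_observations)
    return flags


flags = core_flags([[0.0], [1.0], [5.0]], [[1.5]], epsilon=1.0, min_observations=3)
```

Without the halo observation at 1.5, the row at 1.0 would see only two neighbors and would not qualify, which is why halo data from neighboring blocks is passed to this step.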
Step 7 on Master Node
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

Input ID

Input


partialFinishedFlags
Pointer to the collection of 1 x 1 numeric tables, collected from all nodes, each containing the flag indicating that the clustering process is finished on that node.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Partial Result ID

Result


finishedFlag
Pointer to the 1 x 1 numeric table containing the flag indicating that the clustering process is finished on all nodes.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
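The reduction on the master node amounts to a logical AND over the per-node flags. A minimal sketch follows, assuming a nonzero cell means "finished" (the encoding of the flag is not given in the text) and using the hypothetical name all_finished.

```python
def all_finished(partial_flags):
    """partial_flags: list of 1 x 1 tables, one per node.
    Returns a 1 x 1 table holding 1 only if every node reported a nonzero flag."""
    done = all(table[0][0] != 0 for table in partial_flags)
    return [[1 if done else 0]]


flag = all_finished([[[1]], [[1]], [[0]]])
```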
Step 8 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex 
Not applicable

Unique identifier of block initially passed for computation on the local node.

nBlocks 
Not applicable

Number of blocks initially passed for computation on all nodes.

Input ID

Input


step8InputClusterStructure
Pointer to the numeric table with 4 columns and an arbitrary number of rows containing information about the current clustering state of observations processed on the local node.
The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8InputNClusters
Pointer to the 1 x 1 numeric table containing the current number of clusters found on the local node.
The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8PartialQueries
Pointer to the collection of numeric tables with 3 columns and an arbitrary number of rows containing clustering queries, collected from all nodes, that should be processed on the local node.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Partial Result ID

Result


step8ClusterStructure
Pointer to the numeric table with 4 columns and an arbitrary number of rows containing information about the current clustering state of observations processed on the local node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8FinishedFlag
Pointer to the 1 x 1 numeric table containing the flag indicating that the clustering process is finished for the current node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8NClusters
Pointer to the 1 x 1 numeric table containing the current number of clusters found on the local node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8Queries
Pointer to the collection of nBlocks numeric tables with 3 columns and an arbitrary number of rows containing clustering queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes.
By default, this result is an object of the DataCollection class. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Step 9 on Master Node
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

Input ID

Input


partialNClusters
Pointer to the collection of 1 x 1 numeric tables containing the number of clusters found on each node.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Result ID

Result


step9NClusters
Pointer to the 1 x 1 numeric table containing the total number of clusters found on all nodes.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Partial Result ID

Result


clusterOffsets
Pointer to the collection of 1 x 1 numeric tables containing the offsets for cluster numeration for each node. The numeric tables with offsets are given in the same order as in the collection passed for the partialNClusters input ID.
By default, this result is an object of the DataCollection class. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
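One natural way to obtain per-node offsets that make cluster numbers globally unique is an exclusive prefix sum over the per-node cluster counts. The sketch below assumes that scheme, and also treats step9NClusters as the grand total; neither detail is spelled out in the text, and cluster_offsets is a hypothetical name.

```python
def cluster_offsets(partial_n_clusters):
    """partial_n_clusters: list of 1 x 1 tables with per-node cluster counts.
    Returns (total, offsets): a 1 x 1 total and one 1 x 1 offset per node,
    in the same order as the input collection."""
    offsets = []
    total = 0
    for table in partial_n_clusters:
        offsets.append([[total]])  # clusters on this node start at `total`
        total += table[0][0]
    return [[total]], offsets


total, offsets = cluster_offsets([[[2]], [[0]], [[3]]])
```

With counts 2, 0, and 3, the third node's local clusters 0..2 would map to global clusters 2..4 under this scheme.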
Step 10 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex 
Not applicable

Unique identifier of block initially passed for computation on the local node.

nBlocks 
Not applicable

Number of blocks initially passed for computation on all nodes.

Input ID

Input


step10InputClusterStructure
Pointer to the numeric table with 4 columns and an arbitrary number of rows containing information about the current clustering state of observations processed on the local node.
The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step10ClusterOffset
Pointer to the 1 x 1 numeric table containing the offset for cluster numeration on the local node computed on step 9.
The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Partial Result ID

Result


step10ClusterStructure
Pointer to the numeric table with 4 columns and an arbitrary number of rows containing information about the current clustering state of observations processed on the local node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step10FinishedFlag
Pointer to the 1 x 1 numeric table containing the flag indicating that the cluster numeration process is finished for the current node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step10Queries
Pointer to the collection of nBlocks numeric tables with 4 columns and an arbitrary number of rows containing cluster numeration queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes.
By default, this result is an object of the DataCollection class. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Step 11 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex 
Not applicable

Unique identifier of block initially passed for computation on the local node.

nBlocks 
Not applicable

Number of blocks initially passed for computation on all nodes.

Input ID

Input


step11InputClusterStructure
Pointer to the numeric table with 4 columns and an arbitrary number of rows containing information about the current clustering state of observations processed on the local node.
The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step11PartialQueries
Pointer to the collection of numeric tables with 4 columns and an arbitrary number of rows containing cluster numeration queries, collected from all nodes, that should be processed on the local node.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Partial Result ID

Result


step11ClusterStructure
Pointer to the numeric table with 4 columns and an arbitrary number of rows containing information about the current clustering state of observations processed on the local node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step11FinishedFlag
Pointer to the 1 x 1 numeric table containing the flag indicating that the cluster numeration process is finished for the current node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step11Queries
Pointer to the collection of nBlocks numeric tables with 4 columns and an arbitrary number of rows containing cluster numeration queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes.
By default, this result is an object of the DataCollection class. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Step 12 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex 
Not applicable

Unique identifier of block initially passed for computation on the local node.

nBlocks 
Not applicable

Number of blocks initially passed for computation on all nodes.

Input ID

Input


step12InputClusterStructure
Pointer to the numeric table with 4 columns and an arbitrary number of rows containing information about the current clustering state of observations processed on the local node.
The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step12PartialOrders
Pointer to the collection of n x 2 numeric tables containing information about observations: identifier of the initial block and index in the initial block. This information is required to reconstruct the initial blocks after observations are transferred among nodes.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Partial Result ID

Result


assignmentQueries
Pointer to the collection of nBlocks numeric tables with 2 columns and an arbitrary number of rows containing cluster assignment queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes.
By default, this result is an object of the DataCollection class. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Step 13 on Local Nodes
Parameter

Default Value

Description


algorithmFPType
float
The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method
defaultDense
Available methods for computation of the DBSCAN algorithm: defaultDense.

Input ID

Input


partialAssignmentQueries
Pointer to the collection of numeric tables with 2 columns and an arbitrary number of rows containing cluster assignment queries, collected from all nodes, that should be processed on the local node.
The input can be an object of any class derived from DataCollection. The numeric tables in the collection can be objects of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Result ID

Result


step13Assignments
Pointer to the n x 1 numeric table with assignments of cluster indices to the observations processed on step 1 on the local node. Noise observations have the assignment equal to -1.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
Partial Result ID

Result


step13AssignmentsQueries
Pointer to the numeric table with 2 columns and an arbitrary number of rows containing cluster assignment queries that should be processed on the local node.
By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
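The final step can be pictured as scattering the collected assignment queries back into the order of the node's initial block. The sketch below assumes each query row is (index in the initial block, cluster index) and that observations never mentioned in a query are noise, marked -1 by the usual DBSCAN convention; the column meaning and the name apply_assignment_queries are assumptions, not taken from the library.

```python
def apply_assignment_queries(n_rows, query_tables):
    """Build an n x 1 assignments table for the initial local block.
    query_tables: list of 2-column tables gathered from all nodes."""
    assignments = [[-1] for _ in range(n_rows)]  # default: noise (assumed -1)
    for table in query_tables:
        for index, cluster in table:
            assignments[index] = [cluster]
    return assignments


result = apply_assignment_queries(4, [[[0, 2], [3, 0]], [[1, 2]]])
```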