Distributed Processing
Step 1 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex  Not applicable  Unique identifier of the block initially passed for computation on the local node.

nBlocks  Not applicable  Number of blocks initially passed for computation on all nodes.

Input ID  Input

step1Data  Pointer to the n x p numeric table with the observations to be clustered. The input can be an object of any class derived from NumericTable.

Partial Result ID  Result

partialOrder  Pointer to the n x 2 numeric table containing information about observations: the identifier of the initial block and the index in the initial block. This information is required to reconstruct the initial blocks after observations are transferred among nodes. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
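As a rough illustration of the partialOrder layout described above, the following Python sketch builds the n x 2 table for one block using plain lists. The function name make_partial_order is hypothetical and not part of the library. The sketch assumes that each row stores the initial block identifier followed by the row index within that block, and that indices start at 0.

```python
def make_partial_order(block_index, n_rows):
    """Return an n_rows x 2 table of (initial block id, index in that block)."""
    # Assumption: indices are zero-based and follow the row order of the block.
    return [[block_index, i] for i in range(n_rows)]


# Example: the block with identifier 2 holds three observations.
order = make_partial_order(block_index=2, n_rows=3)
```

Keeping the pair (block, index) next to every observation is what makes it possible to send results back to the node that originally owned the observation.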
Step 2 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex  Not applicable  Unique identifier of the block initially passed for computation on the local node.

nBlocks  Not applicable  Number of blocks initially passed for computation on all nodes.

Input ID  Input

partialData  Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable.

Partial Result ID  Result

boundingBox  Pointer to the 2 x p numeric table containing the bounding box of the input observations: the first row contains the minimum value of each feature, and the second row contains the maximum value of each feature. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
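The boundingBox result can be pictured with a small Python sketch over plain lists. The function name bounding_box is hypothetical. The sketch only assumes the 2 x p layout stated above (minima in the first row, maxima in the second) and that the collection holds at least one observation.

```python
def bounding_box(tables):
    """Return a 2 x p table: row 0 holds per-feature minima, row 1 maxima."""
    # Flatten the collection of tables into one list of observations.
    rows = [row for table in tables for row in table]
    if not rows:
        raise ValueError("at least one observation is required")
    columns = list(zip(*rows))  # one tuple per feature
    return [[min(c) for c in columns], [max(c) for c in columns]]


box = bounding_box([[[1.0, 5.0], [3.0, 2.0]], [[0.0, 4.0]]])
```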
Step 3 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

leftBlocks  Not applicable  Number of blocks that will process observations whose value of the selected split feature is smaller than the selected split value.

rightBlocks  Not applicable  Number of blocks that will process observations whose value of the selected split feature is greater than the selected split value.

Input ID  Input

partialData  Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable.

step3PartialBoundingBoxes  Pointer to the collection of 2 x p numeric tables containing bounding boxes computed on step 2 and collected from all nodes participating in the current iteration of the geometric repartitioning process. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

split  Pointer to the 1 x 2 numeric table containing information about the split for the current iteration of geometric repartitioning. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
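The text above does not say what the two values of the split table hold or how the split is chosen. The following Python sketch is therefore only an illustration under stated assumptions: it merges the partial bounding boxes into one global box, then applies a stand-in rule that splits the widest feature at its midpoint and returns a 1 x 2 table of (feature index, split value). Both function names and the midpoint rule are hypothetical and may differ from the library's behavior.

```python
def merge_bounding_boxes(boxes):
    """Combine 2 x p boxes from all nodes into one global 2 x p box."""
    mins = [min(col) for col in zip(*(box[0] for box in boxes))]
    maxs = [max(col) for col in zip(*(box[1] for box in boxes))]
    return [mins, maxs]


def choose_split(boxes):
    """Hypothetical rule: split the widest feature at its midpoint.

    Returns a 1 x 2 table [[feature index, split value]] (assumed layout).
    """
    mins, maxs = merge_bounding_boxes(boxes)
    widths = [hi - lo for lo, hi in zip(mins, maxs)]
    feature = widths.index(max(widths))
    return [[feature, (mins[feature] + maxs[feature]) / 2.0]]


boxes = [[[0.0, 0.0], [1.0, 4.0]], [[0.5, 2.0], [2.0, 10.0]]]
split = choose_split(boxes)
```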
Step 4 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

leftBlocks  Not applicable  Number of blocks that will process observations whose value of the selected split feature is smaller than the selected split value.

rightBlocks  Not applicable  Number of blocks that will process observations whose value of the selected split feature is greater than the selected split value.

Input ID  Input

partialData  Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable.

step4PartialOrders  Pointer to the collection of numeric tables with 2 columns and an arbitrary number of rows, containing information about observations: the identifier of the initial block and the index in the initial block. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step4PartialSplits  Pointer to the collection of 1 x 2 numeric tables containing information about the split computed on step 3 and collected from all nodes participating in the current iteration of the geometric repartitioning process. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

partitionedData  Pointer to the collection of (leftBlocks + rightBlocks) numeric tables with p columns and an arbitrary number of rows, containing observations for processing on the nodes participating in the current iteration of geometric repartitioning. The first leftBlocks numeric tables in the collection hold observations whose value of the selected split feature is smaller than the selected split value. The next rightBlocks numeric tables hold observations whose value of the selected split feature is larger than the selected split value. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
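The ordering of partitionedData (left tables first, then right tables) can be sketched in Python as follows. The function name partition_by_split is hypothetical. Two details are assumptions rather than documented behavior: observations equal to the split value are sent to the right side, and observations on each side are spread over that side's tables in round-robin order.

```python
def partition_by_split(tables, feature, value, left_blocks, right_blocks):
    """Return left_blocks + right_blocks tables, left tables first."""
    left = [[] for _ in range(left_blocks)]
    right = [[] for _ in range(right_blocks)]
    n_left = n_right = 0
    for table in tables:
        for row in table:
            if row[feature] < value:
                # Assumption: round-robin distribution among the left tables.
                left[n_left % left_blocks].append(row)
                n_left += 1
            else:
                # Assumption: ties (row[feature] == value) go to the right.
                right[n_right % right_blocks].append(row)
                n_right += 1
    return left + right


data = [[[1.0], [5.0], [2.0], [7.0], [3.0]]]
parts = partition_by_split(data, feature=0, value=4.0, left_blocks=2, right_blocks=1)
```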
Step 5 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex  Not applicable  Unique identifier of the block initially passed for computation on the local node.

nBlocks  Not applicable  Number of blocks initially passed for computation on all nodes.

epsilon  Not applicable  The maximum distance between observations lying in the same neighborhood.

Input ID  Input

partialData  Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable.

step5PartialBoundingBoxes  Pointer to the collection of 2 x p numeric tables containing bounding boxes computed on step 2 and collected from all nodes. The numeric tables in the collection should be ordered by the identifiers of the initial blocks of the nodes. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

partitionedHaloData  Pointer to the collection of nBlocks numeric tables with p columns and an arbitrary number of rows, containing observations from the current node that should be used as halo observations on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes. By default, this result is an object of the DataCollection class. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

partitionedHaloDataIndices  Pointer to the collection of nBlocks numeric tables with 1 column and an arbitrary number of rows, containing indices of observations from the current node that should be used as halo observations on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes. By default, this result is an object of the DataCollection class. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
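One way to picture the two halo results is the Python sketch below. It is not the library's implementation. It assumes that an observation is a halo candidate for another node when it lies inside that node's bounding box enlarged by epsilon in every feature, that nothing is sent to the node's own block, and that indices refer to positions in the flattened local data. The function name partition_halo is hypothetical.

```python
def partition_halo(rows, boxes, block_index, epsilon):
    """Return (halo data, halo indices), one table per block, ordered by block id."""
    halo_data, halo_indices = [], []
    for block, (mins, maxs) in enumerate(boxes):
        data, indices = [], []
        if block != block_index:  # assumption: no halo is built for the own block
            for i, row in enumerate(rows):
                # Assumption: membership test against the box grown by epsilon.
                if all(lo - epsilon <= x <= hi + epsilon
                       for x, lo, hi in zip(row, mins, maxs)):
                    data.append(row)
                    indices.append([i])  # 1-column table of local indices
        halo_data.append(data)
        halo_indices.append(indices)
    return halo_data, halo_indices


rows = [[0.0, 0.0], [0.9, 0.5], [3.0, 3.0]]
boxes = [[[0.0, 0.0], [3.0, 3.0]], [[1.0, 0.0], [2.0, 1.0]]]
halo, idx = partition_halo(rows, boxes, block_index=0, epsilon=0.2)
```

Both returned collections have nBlocks entries and share the same ordering, which matches the requirement that the index collection mirrors the data collection.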
Step 6 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex  Not applicable  Unique identifier of the block initially passed for computation on the local node.

nBlocks  Not applicable  Number of blocks initially passed for computation on all nodes.

epsilon  Not applicable  The maximum distance between observations lying in the same neighborhood.

minObservations  Not applicable  The number of observations in a neighborhood required for an observation to be considered a core observation.

memorySavingMode  false  If the flag is set to false, all neighborhoods are computed and stored prior to clustering. This requires up to O(sum of sizes of neighborhoods) additional memory, which in the worst case can be O((number of observations)^2), but in the common case performance may be better.

Input ID  Input

partialData  Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing observations to be clustered. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable.

haloData  Pointer to the collection of numeric tables with p columns and an arbitrary number of rows, containing halo observations for the current node computed on step 5. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable.

haloDataIndices  Pointer to the collection of numeric tables with 1 column and an arbitrary number of rows, containing indices of halo observations for the current node computed on step 5. The size of this collection should be equal to the size of the collection for the haloData Input ID. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

haloDataBlocks  Pointer to the collection of 1 x 1 numeric tables containing identifiers of the initial blocks of halo observations for the current node computed on step 5. The size of this collection should be equal to the size of the collection for the haloData Input ID. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

step6ClusterStructure  Pointer to the numeric table with 4 columns and an arbitrary number of rows, containing information about the current clustering state of observations processed on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step6FinishedFlag  Pointer to the 1 x 1 numeric table containing the flag indicating that the clustering process is finished for the current node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step6NClusters  Pointer to the 1 x 1 numeric table containing the current number of clusters found on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step6Queries  Pointer to the collection of nBlocks numeric tables with 3 columns and an arbitrary number of rows, containing clustering queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes. By default, this result is an object of the DataCollection class. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
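To make the roles of epsilon, minObservations, and memorySavingMode concrete, here is a minimal Python sketch of brute-force neighborhood computation and core detection. It is an illustration only, under these assumptions: distances are Euclidean, a neighborhood includes the observation itself, and an observation is a core observation when its neighborhood holds at least minObservations members. The function names are hypothetical.

```python
import math


def neighborhood(rows, i, epsilon):
    """Indices of observations within epsilon of observation i (brute force)."""
    return [j for j, other in enumerate(rows)
            if math.dist(rows[i], other) <= epsilon]


def core_flags(rows, epsilon, min_observations, memory_saving_mode=False):
    """Return one boolean per observation: True for core observations."""
    n = len(rows)
    if memory_saving_mode:
        # Compute each neighborhood on demand and discard it right away.
        return [len(neighborhood(rows, i, epsilon)) >= min_observations
                for i in range(n)]
    # Precompute and keep every neighborhood: extra memory proportional to
    # the sum of neighborhood sizes, O(n^2) in the worst case.
    stored = [neighborhood(rows, i, epsilon) for i in range(n)]
    return [len(nbrs) >= min_observations for nbrs in stored]


rows = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0]]
flags = core_flags(rows, epsilon=1.0, min_observations=3)
```

Both modes give the same flags; the difference is only whether the neighborhoods stay in memory for later reuse during clustering.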
Step 7 - on Master Node
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

Input ID  Input

partialFinishedFlags  Pointer to the collection of 1 x 1 numeric tables, collected from all nodes, each containing the flag indicating that the clustering process is finished. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

finishedFlag  Pointer to the 1 x 1 numeric table containing the flag indicating that the clustering process is finished on all nodes. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
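The reduction performed on the master node can be sketched in a few lines of Python. The sketch assumes that a nonzero flag means "finished" and that the combined flag is the logical AND of the per-node flags. The function name combine_finished_flags is hypothetical.

```python
def combine_finished_flags(partial_flags):
    """partial_flags: list of 1 x 1 tables; returns a 1 x 1 table."""
    # Assumption: nonzero means finished; all nodes must be finished.
    finished = all(flag[0][0] != 0 for flag in partial_flags)
    return [[1 if finished else 0]]


all_done = combine_finished_flags([[[1]], [[1]]])
not_done = combine_finished_flags([[[1]], [[0]]])
```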
Step 8 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex  Not applicable  Unique identifier of the block initially passed for computation on the local node.

nBlocks  Not applicable  Number of blocks initially passed for computation on all nodes.

Input ID  Input

step8InputClusterStructure  Pointer to the numeric table with 4 columns and an arbitrary number of rows, containing information about the current clustering state of observations processed on the local node. The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8InputNClusters  Pointer to the 1 x 1 numeric table containing the current number of clusters found on the local node. The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8PartialQueries  Pointer to the collection of numeric tables with 3 columns and an arbitrary number of rows, collected from all nodes and containing clustering queries that should be processed on the local node. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

step8ClusterStructure  Pointer to the numeric table with 4 columns and an arbitrary number of rows, containing information about the current clustering state of observations processed on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8FinishedFlag  Pointer to the 1 x 1 numeric table containing the flag indicating that the clustering process is finished for the current node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8NClusters  Pointer to the 1 x 1 numeric table containing the current number of clusters found on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step8Queries  Pointer to the collection of nBlocks numeric tables with 3 columns and an arbitrary number of rows, containing clustering queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes. By default, this result is an object of the DataCollection class. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
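Each node produces one query table per destination node, and each node consumes the tables addressed to it by all nodes. The exchange between the two is an all-to-all transfer, which the following Python sketch models in a single process as a transpose of nested lists. The function name exchange_queries is hypothetical, and the contents of the 3-column query rows are treated as opaque because they are not described here. In a real deployment the transfer would be done by the communication layer of your choice.

```python
def exchange_queries(queries_per_node):
    """Route query tables from senders to receivers.

    queries_per_node[i][j] is the table that node i produced for node j.
    The result r satisfies r[j][i] == queries_per_node[i][j], so r[j] is
    the collection node j receives, ordered by sender.
    """
    n_blocks = len(queries_per_node)
    return [[queries_per_node[i][j] for i in range(n_blocks)]
            for j in range(n_blocks)]


# Two nodes; rows are placeholder 3-column queries.
produced = [
    [[[0, 0, 0]], [[0, 1, 0]]],  # node 0: tables for node 0 and node 1
    [[[1, 0, 0]], []],           # node 1: tables for node 0 and node 1
]
received = exchange_queries(produced)
```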
Step 9 - on Master Node
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

Input ID  Input

partialNClusters  Pointer to the collection of 1 x 1 numeric tables containing the number of clusters found on each node. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Result ID  Result

step9NClusters  Pointer to the 1 x 1 numeric table containing the total number of clusters found on all nodes. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

clusterOffsets  Pointer to the collection of 1 x 1 numeric tables containing offsets for cluster numeration for each node. The numeric tables with offsets are given in the same order as in the collection for the partialNClusters Input ID. By default, this result is an object of the DataCollection class. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
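A natural way to derive per-node offsets from per-node cluster counts is an exclusive prefix sum, with the grand total as a by-product. The Python sketch below shows that idea. It is an assumption that the offsets are computed this way and that the total equals the sum of the per-node counts. The function name cluster_offsets is hypothetical.

```python
def cluster_offsets(partial_n_clusters):
    """Return (total as a 1 x 1 table, list of 1 x 1 offset tables)."""
    offsets, total = [], 0
    for table in partial_n_clusters:
        offsets.append([[total]])  # clusters of this node start at `total`
        total += table[0][0]
    return [[total]], offsets


# Three nodes that found 3, 0, and 2 clusters respectively.
n_clusters, offsets = cluster_offsets([[[3]], [[0]], [[2]]])
```

With these offsets, local cluster identifiers 0..k-1 on a node map to a range that does not overlap the range of any other node.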
Step 10 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex  Not applicable  Unique identifier of the block initially passed for computation on the local node.

nBlocks  Not applicable  Number of blocks initially passed for computation on all nodes.

Input ID  Input

step10InputClusterStructure  Pointer to the numeric table with 4 columns and an arbitrary number of rows, containing information about the current clustering state of observations processed on the local node. The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step10ClusterOffset  Pointer to the 1 x 1 numeric table containing the offset for cluster numeration on the local node computed on step 9. The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

step10ClusterStructure  Pointer to the numeric table with 4 columns and an arbitrary number of rows, containing information about the current clustering state of observations processed on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step10FinishedFlag  Pointer to the 1 x 1 numeric table containing the flag indicating that the cluster numeration process is finished for the current node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step10Queries  Pointer to the collection of nBlocks numeric tables with 4 columns and an arbitrary number of rows, containing cluster numeration queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes. By default, this result is an object of the DataCollection class. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
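The effect of the cluster offset on local cluster identifiers can be sketched as below. The layout of the 4-column cluster structure is not described here, so the sketch works on a plain list of local cluster identifiers instead. It assumes that negative values mark observations without a cluster and are left untouched. Reconciling clusters that span several nodes is outside the scope of this sketch. The function name apply_cluster_offset is hypothetical.

```python
def apply_cluster_offset(local_ids, offset):
    """Shift non-negative local cluster ids by the node's offset."""
    # Assumption: negative ids (for example -1 for noise) are not shifted.
    return [cid + offset if cid >= 0 else cid for cid in local_ids]


global_ids = apply_cluster_offset([0, 1, -1, 0], offset=3)
```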
Step 11 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex  Not applicable  Unique identifier of the block initially passed for computation on the local node.

nBlocks  Not applicable  Number of blocks initially passed for computation on all nodes.

Input ID  Input

step11InputClusterStructure  Pointer to the numeric table with 4 columns and an arbitrary number of rows, containing information about the current clustering state of observations processed on the local node. The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step11PartialQueries  Pointer to the collection of numeric tables with 4 columns and an arbitrary number of rows, collected from all nodes and containing cluster numeration queries that should be processed on the local node. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

step11ClusterStructure  Pointer to the numeric table with 4 columns and an arbitrary number of rows, containing information about the current clustering state of observations processed on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step11FinishedFlag  Pointer to the 1 x 1 numeric table containing the flag indicating that the cluster numeration process is finished for the current node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step11Queries  Pointer to the collection of nBlocks numeric tables with 4 columns and an arbitrary number of rows, containing cluster numeration queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes. By default, this result is an object of the DataCollection class. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
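Because this step both consumes queries from all nodes and emits new queries plus a finished flag, it lends itself to being repeated until every node reports that it is finished. The Python sketch below shows such a driver loop in a single process. This control flow is an assumption, not something stated above. The callback signature step(block, state, incoming) returning (new state, finished, outgoing) and the name run_until_finished are hypothetical.

```python
def run_until_finished(states, queries, step):
    """Repeat `step` on every node until all nodes report finished.

    queries[i][j] is the table node i produced for node j.
    step(block, state, incoming) -> (new_state, finished, outgoing), where
    outgoing holds one query table per destination node.
    """
    n_blocks = len(states)
    while True:
        # All-to-all exchange: node j receives table j from every node.
        incoming = [[queries[i][j] for i in range(n_blocks)]
                    for j in range(n_blocks)]
        results = [step(j, states[j], incoming[j]) for j in range(n_blocks)]
        states = [r[0] for r in results]
        queries = [r[2] for r in results]
        if all(r[1] for r in results):  # same role as the master-node flag
            return states


def toy_step(block, state, incoming):
    """Toy callback: forward each query to the other node until hops run out."""
    rows = [row for table in incoming for row in table]
    outgoing = [[], []]
    for (hops,) in rows:
        if hops > 0:
            outgoing[1 - block].append([hops - 1])
    return state + len(rows), not any(outgoing), outgoing


# Node 0 starts with one query for node 1 that may travel two more hops.
final_states = run_until_finished([0, 0], [[[], [[2]]], [[], []]], toy_step)
```

The toy callback only counts how many queries each node processed; a real callback would update the cluster structure instead.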
Step 12 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

blockIndex  Not applicable  Unique identifier of the block initially passed for computation on the local node.

nBlocks  Not applicable  Number of blocks initially passed for computation on all nodes.

Input ID  Input

step12InputClusterStructure  Pointer to the numeric table with 4 columns and an arbitrary number of rows, containing information about the current clustering state of observations processed on the local node. The input can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

step12PartialOrders  Pointer to the collection of n x 2 numeric tables containing information about observations: the identifier of the initial block and the index in the initial block. This information is required to reconstruct the initial blocks after observations are transferred among nodes. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

assignmentQueries  Pointer to the collection of nBlocks numeric tables with 2 columns and an arbitrary number of rows, containing cluster assignment queries that should be processed on each node. The numeric tables in the collection are ordered by the identifiers of the initial blocks of the nodes. By default, this result is an object of the DataCollection class. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
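The partial orders make it possible to send each observation's cluster back to the node that originally owned it. The Python sketch below illustrates this routing. The meaning of the two query columns is not spelled out above, so the sketch assumes each query row is (index in initial block, cluster identifier). It also takes the cluster identifiers as a plain list aligned with the order rows, rather than reading them from the 4-column cluster structure. The function name build_assignment_queries is hypothetical.

```python
def build_assignment_queries(orders, cluster_ids, n_blocks):
    """Group (index in initial block, cluster id) rows by initial block.

    orders: collection of tables whose rows are (initial block, index).
    cluster_ids: one cluster id per order row, in the same flattened order.
    """
    queries = [[] for _ in range(n_blocks)]
    rows = [row for table in orders for row in table]
    for (block, index), cid in zip(rows, cluster_ids):
        queries[block].append([index, cid])  # assumed 2-column layout
    return queries


orders = [[[0, 1], [1, 0]], [[0, 0]]]
queries = build_assignment_queries(orders, cluster_ids=[5, -1, 5], n_blocks=2)
```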
Step 13 - on Local Nodes
Parameter  Default Value  Description

algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method  defaultDense  Available methods for computation of the DBSCAN algorithm: defaultDense.

Input ID  Input

partialAssignmentQueries  Pointer to the collection of numeric tables with 2 columns and an arbitrary number of rows, collected from all nodes and containing cluster assignment queries that should be processed on the local node. The input can be an object of any class derived from DataCollection. Each numeric table in the collection can be an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Result ID  Result

step13Assignments  Pointer to the n x 1 numeric table with assignments of cluster indices to the observations processed on step 1 on the local node. Noise observations have the assignment equal to -1. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Partial Result ID  Result

step13AssignmentsQueries  Pointer to the numeric table with 2 columns and an arbitrary number of rows, containing cluster assignment queries that should be processed on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
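The final assignment table can be pictured as the result of replaying the received queries against an n x 1 table. The Python sketch below assumes that each query row is (index in the initial block, cluster identifier), and that observations for which no query arrives keep the value -1, the label conventionally used for noise in DBSCAN. The function name apply_assignment_queries is hypothetical.

```python
def apply_assignment_queries(partial_queries, n_rows):
    """Build an n_rows x 1 assignment table from the received queries."""
    # Assumption: start from -1 (noise) and overwrite with received cluster ids.
    assignments = [[-1] for _ in range(n_rows)]
    for table in partial_queries:
        for index, cid in table:
            assignments[index] = [cid]
    return assignments


received = [[[1, 5], [0, 5]], [[2, -1]]]
assignments = apply_assignment_queries(received, n_rows=4)
```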