# Density-Based Spatial Clustering of Applications with Noise

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed in [Ester96].
It is a density-based clustering non-parametric algorithm: given a set of observations in some space,
it groups together observations that are closely packed together (observations with many nearby neighbors),
marking as outliers observations that lie alone in low-density regions (whose nearest neighbors are too far away).

## Details

Given the set
of

n

*-dimensional feature vectors (further referred as observations), a positive floating-point number*p

epsilon

and a positive integer minObservations

,
the problem is to get clustering assignments for each input observation, based on the definitions below [Ester96]:- core observation
- An observation
is called core observation if at leastxminObservationsinput observations (including) are within distancexepsilonfrom observation;x - directly reachable
- An observation
is directly reachable fromyifxis within distanceyepsilonfrom core observation. Observations are only said to be directly reachable from core observations.x - reachable
- An observation
is reachable from an observationyif there is a path with and , where each is directly reachable from . This implies that all observations on the path must be core observations, with the possible exception ofx.y - noise observation
- Noise observations are observations that are not reachable from any other observation.
- cluster
- Two observations
andxare considered to be in the same cluster if there is a core observationy, andzandxare both reachable fromy.z

Each cluster gets a unique identifier, an integer number from

*to . Each observation is assigned an identifier of the cluster it belongs to, or if the observation considered to be a noise observation.*0

## Examples

C++ (CPU)

Java*

There is no support for Java on GPU.

Batch Processing:

Distributed Processing:

Python* with DPC++ support

Batch Processing:

Python*