public class DBScanMapReduce extends Object
Clusters are merged if they share neighbors and both clusters meet the minimum size constraints.
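The merge rule can be illustrated with a minimal sketch. It is not the GeoWave implementation: `ClusterMergeSketch`, `Cluster`, `shouldMerge`, and `merge` are hypothetical names, and it assumes that "sharing neighbors" means the two clusters have at least one member ID in common.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of the merge rule; not a GeoWave class. */
final class ClusterMergeSketch {

  /** A cluster reduced to the IDs of its members (assumed representation). */
  static final class Cluster {
    final Set<String> memberIds = new HashSet<>();

    Cluster(String... ids) {
      Collections.addAll(memberIds, ids);
    }
  }

  /** True when both clusters meet the minimum size and share at least one member. */
  static boolean shouldMerge(Cluster a, Cluster b, int minimumSize) {
    if (a.memberIds.size() < minimumSize || b.memberIds.size() < minimumSize) {
      return false;
    }
    // Collections.disjoint returns true when the two sets have nothing in common.
    return !Collections.disjoint(a.memberIds, b.memberIds);
  }

  /** Folds the members of {@code from} into {@code into}. */
  static void merge(Cluster into, Cluster from) {
    into.memberIds.addAll(from.memberIds);
  }
}
```

In this sketch a cluster that shares a neighbor but falls below the minimum size is left unmerged, which mirrors the requirement that both clusters meet the size constraint.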
Clusters may be made up of points or geometries. When processing geometries, only the two closest points are included in the cluster, not the entire geometry, because geometries may span large areas. A disadvantage of this technique is that dense segments are misrepresented as dense sets of points.
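As a rough illustration of the "closest two points" idea, the sketch below covers only the special case of a point and a line: the pair is then the point itself and its nearest location on the line. It assumes planar coordinates, and `ClosestPointSketch` and its methods are hypothetical names rather than part of GeoWave.

```java
/** Hypothetical illustration for a point and a polyline; not a GeoWave class. */
final class ClosestPointSketch {

  /** Closest location to (px, py) on the segment from (ax, ay) to (bx, by). */
  static double[] closestOnSegment(
      double px, double py, double ax, double ay, double bx, double by) {
    double dx = bx - ax;
    double dy = by - ay;
    double lengthSquared = dx * dx + dy * dy;
    if (lengthSquared == 0.0) {
      // Degenerate segment: both end points coincide.
      return new double[] {ax, ay};
    }
    // Project the point onto the segment's line, then clamp to the segment.
    double t = ((px - ax) * dx + (py - ay) * dy) / lengthSquared;
    t = Math.max(0.0, Math.min(1.0, t));
    return new double[] {ax + t * dx, ay + t * dy};
  }

  /**
   * Closest location on a polyline (array of [x, y] vertices) to a query point.
   * Returns null when the polyline has fewer than two vertices.
   */
  static double[] closestOnLine(double px, double py, double[][] line) {
    double[] best = null;
    double bestDistance = Double.POSITIVE_INFINITY;
    for (int i = 0; i + 1 < line.length; i++) {
      double[] candidate =
          closestOnSegment(px, py, line[i][0], line[i][1], line[i + 1][0], line[i + 1][1]);
      double distance = Math.hypot(candidate[0] - px, candidate[1] - py);
      if (distance < bestDistance) {
        bestDistance = distance;
        best = candidate;
      }
    }
    return best;
  }
}
```

Only the returned location and the query point would be carried into the cluster in this sketch, so a long line contributes a single point regardless of how far it extends.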
The design uses two-level partitioning, working within the confines of {@link NNProcessor}. Performance gains are achieved, and memory constraints respected, through a pre-processing step.
Pre-processing first finds dense clusters, replacing each dense cluster with a concave polygon. Although not very scientific, the condensing process sets the minimum condensed cluster size between 50 and 200, depending on the setting of the minimum owners. The choice is somewhat arbitrary. Retaining individual points for clusters larger than 200 often creates memory concerns. However, there is little value in condensing below 50, as that indicates a fairly small cluster that does not contribute to a performance concern. Override 'calculateCondensingMinimum()' to come up with a different approach.
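The sketch below shows one way a condensing minimum could be derived and overridden. It makes several assumptions: the base class `CondensingPolicySketch` is a hypothetical stand-in (not the GeoWave type that declares the hook), the no-argument `int` signature is a guess, and clamping the minimum owners to the range 50 to 200 is only one plausible reading of the description above.

```java
/** Hypothetical stand-in for the class that owns the hook; not a GeoWave class. */
class CondensingPolicySketch {
  protected final int minOwners;

  CondensingPolicySketch(int minOwners) {
    this.minOwners = minOwners;
  }

  /** Assumed default: clamp the minimum owners to the range [50, 200]. */
  protected int calculateCondensingMinimum() {
    return Math.max(50, Math.min(200, minOwners));
  }
}

/** Example override that condenses at a fixed size regardless of the minimum owners. */
class FixedCondensingPolicySketch extends CondensingPolicySketch {
  private final int fixedMinimum;

  FixedCondensingPolicySketch(int minOwners, int fixedMinimum) {
    super(minOwners);
    this.fixedMinimum = fixedMinimum;
  }

  @Override
  protected int calculateCondensingMinimum() {
    return fixedMinimum;
  }
}
```

A lower value condenses more clusters into polygons and reduces memory use, while a higher value retains more individual points.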
Pre-processing also finds cluster centers that have fewer than the minimum and tosses those centers. A word of caution: clusters of this type can fall on the 'edge' of dense clusters, so tossing them 'tightens' the dense regions. It does effectively remove outliers. Alter the approach by overriding 'calculateTossMinimum()' (e.g. make it a smaller number like 0 or 1).
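The effect of the toss minimum can be sketched as a simple filter. This is an illustration only: `TossFilterSketch` and `retainCenters` are hypothetical names, and the sketch assumes each cluster center is described by an ID and a neighbor count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of tossing sparse cluster centers; not a GeoWave class. */
final class TossFilterSketch {

  /**
   * Returns the IDs of centers whose neighbor count is at least {@code tossMinimum};
   * centers below the minimum are tossed. Order follows the map's iteration order.
   */
  static List<String> retainCenters(Map<String, Integer> neighborCounts, int tossMinimum) {
    List<String> kept = new ArrayList<>();
    for (Map.Entry<String, Integer> entry : neighborCounts.entrySet()) {
      if (entry.getValue() >= tossMinimum) {
        kept.add(entry.getKey());
      }
    }
    return kept;
  }
}
```

With a toss minimum of 0 nothing is discarded in this sketch, which corresponds to the suggestion of lowering the value to keep sparse centers near the edges of dense regions.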
Modifier and Type | Class and Description |
---|---|
static class | DBScanMapReduce.DBScanMapHullReducer |
static class | DBScanMapReduce.DBScanMapReducer<KEYOUT,VALUEOUT> |
static class | DBScanMapReduce.SimpleFeatureToClusterItemConverter |
Modifier and Type | Field and Description |
---|---|
protected static org.slf4j.Logger | LOGGER |
Constructor and Description |
---|
DBScanMapReduce() |
Copyright © 2013–2020. All rights reserved.