Difference Between CURE Clustering and DBSCAN Clustering

Last modified: 2021-09-14 02:47:23 | Author: Mango

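Clustering is a technique used in unsupervised learning in which data samples are grouped into clusters based on the similarity of their inherent properties. It can also be defined as a technique for grouping data items that are similar in some respect. Data items in the same cluster are similar to one another in some way, while data items in different clusters are dissimilar.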

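CURE (Clustering Using REpresentatives) and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) are clustering algorithms used in unsupervised learning. CURE is a hierarchical clustering technique, whereas DBSCAN is a density-based clustering technique.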

Some differences between CURE and DBSCAN are listed below:

| S.No. | CURE Clustering | DBSCAN Clustering |
|---|---|---|
| 1. | CURE stands for Clustering Using REpresentatives. | DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise. |
| 2. | It is a hierarchical clustering technique. | It is a density-based clustering technique. |
| 3. | Noise handling in CURE clustering is not efficient. | Noise handling in DBSCAN clustering is efficient. |
| 4. | Algorithm: (1) Draw a random sample. (2) Partition the random sample. (3) Partially cluster each partition. (4) Identify and eliminate outliers. (5) Merge the partial clusters into the final clusters. (6) Label the data on disk. | Algorithm: (1) Label every data point as a core point, border point or noise point. (2) Eliminate the noise points. (3) Connect all core points that lie within Eps of each other. (4) Group each set of connected core points into a separate cluster. (5) Assign each border point to the cluster of one of its neighbouring core points. |
| 5. | It can handle high-dimensional datasets. | It does not work well on high-dimensional datasets. |
| 6. | Varying densities of the data points do not matter in the CURE clustering algorithm. | It does not work well when the data points have varying densities. |

CURE Architecture:
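
The representative-point idea behind CURE can be illustrated with a minimal Python sketch that uses only the standard library. It covers only the hierarchical merging step: each cluster keeps a few well-scattered points shrunk toward its centroid, and the two clusters whose representatives are closest are merged until k clusters remain. The sampling, partitioning, outlier-elimination and labelling steps are omitted. The names `cure`, `pick_representatives`, `num_reps` and `alpha` (the shrink factor) are illustrative assumptions, not taken from the article.

```python
import math
from itertools import combinations


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def pick_representatives(points, num_reps, alpha):
    """Choose up to num_reps well-scattered points, shrunk toward the centroid."""
    center = centroid(points)
    # Start with the point farthest from the centroid.
    chosen = [max(range(len(points)), key=lambda i: math.dist(points[i], center))]
    while len(chosen) < min(num_reps, len(points)):
        remaining = [i for i in range(len(points)) if i not in chosen]
        # Next: the point farthest from every representative chosen so far.
        chosen.append(max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[j]) for j in chosen),
        ))
    # Shrink each chosen point toward the centroid by the factor alpha.
    return [
        tuple(p + alpha * (c - p) for p, c in zip(points[i], center))
        for i in chosen
    ]


def cluster_distance(reps_a, reps_b):
    """Distance between two clusters = closest pair of representatives."""
    return min(math.dist(r, s) for r in reps_a for s in reps_b)


def cure(points, k, num_reps=4, alpha=0.3):
    """Merge clusters bottom-up until only k remain (assumption: no sampling)."""
    clusters = [[p] for p in points]
    reps = [pick_representatives(c, num_reps, alpha) for c in clusters]
    while len(clusters) > k:
        # Find the pair of clusters with the closest representatives (a < b).
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: cluster_distance(reps[ab[0]], reps[ab[1]]),
        )
        merged = clusters[a] + clusters[b]
        del clusters[b], reps[b]  # b > a, so index a stays valid
        clusters[a] = merged
        reps[a] = pick_representatives(merged, num_reps, alpha)
    return clusters
```

Calling `cure(points, k=2)` on a list of 2-D tuples returns two lists of points. This brute-force version recomputes all pairwise cluster distances on every merge, so it is only suitable for small inputs.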

DBSCAN Architecture:

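Eps: the radius of the circle (neighbourhood) drawn around a point.
minPts: the minimum number of points that must lie within the Eps neighbourhood of a point.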
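
To show how Eps and minPts drive the result, here is a minimal standard-library Python sketch of DBSCAN. It follows the common one-pass formulation (grow a cluster outward from each unvisited core point) rather than the exact step order listed above, and it assumes that a point counts as a member of its own Eps neighbourhood. The names `dbscan`, `region_query`, `eps`, `min_pts` and `NOISE` are illustrative.

```python
import math

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            # Not a core point; may later be relabelled as a border point.
            labels[i] = NOISE
            continue
        # points[i] is a core point: start a new cluster and expand it.
        labels[i] = cluster_id
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also a core point
        cluster_id += 1
    return labels
```

A larger `eps` or smaller `min_pts` turns more points into core points and merges clusters; the opposite marks more points as noise. Because a single `eps` is applied everywhere, no one setting fits regions of very different density, which is the limitation noted in the comparison.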