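Prerequisites: Classification and Clustering
Now that you have read the articles on classification and clustering, here is how the two differ.
Both classification and clustering are used to sort objects into one or more classes based on their features. They can look like the same process because the basic difference is small. In classification, each input instance is assigned a predefined label according to its attributes, whereas in clustering no such labels are available.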
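To make the role of labels concrete, here is a minimal sketch in plain Python. The toy data, the label names `small` and `large`, and the helper functions are all hypothetical and chosen only for illustration. The classifier is a simple nearest-centroid rule and the clustering is a two-cluster k-means on one feature; neither is meant as a definitive implementation.

```python
from statistics import mean

# Hypothetical toy data: one feature value per instance.
# In the supervised case every value comes with a predefined label.
labeled = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
           (5.0, "large"), (5.3, "large"), (4.9, "large")]

def fit_centroids(samples):
    """Classification (supervised): learn one mean per predefined label."""
    by_label = {}
    for x, label in samples:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def classify(centroids, x):
    """Assign the predefined label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def two_means(values, iterations=10):
    """Clustering (unsupervised): split unlabeled values into two groups."""
    centers = [min(values), max(values)]  # simple deterministic start
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for x in values:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups

centroids = fit_centroids(labeled)           # uses the labels
predicted = classify(centroids, 4.5)         # returns a predefined label
clusters = two_means([x for x, _ in labeled])  # labels are never seen
```

The classifier returns one of the predefined names, while the clustering step returns only anonymous groups of values; any meaning attached to those groups has to be supplied afterwards.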
Comparison of classification and clustering:
Parameter | CLASSIFICATION | CLUSTERING |
---|---|---|
Type | Used for supervised learning | Used for unsupervised learning |
Basic | Assigns input instances to classes based on their corresponding class labels | Groups instances based on their similarity, without the help of class labels |
Need | Labels are available, so training and testing datasets are needed to verify the model | No training and testing datasets are needed |
Complexity | More complex than clustering | Less complex than classification |
Example algorithms | Logistic regression, naive Bayes classifier, support vector machines, etc. | k-means clustering, fuzzy c-means clustering, Gaussian (EM) clustering, etc. |
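Differences between classification and clustering
- Classification is used for supervised learning, whereas clustering is used for unsupervised learning.
- The process of assigning input instances to classes based on their corresponding class labels is called classification, whereas grouping instances by their similarity without the help of class labels is called clustering.
- Because classification has labels, it needs training and testing datasets to verify the model that was built; clustering needs no training and testing datasets.
- Classification is more complex than clustering, because the classification phase has many levels, whereas clustering only performs grouping.
- Examples of classification algorithms are logistic regression, the naive Bayes classifier, and support vector machines. Examples of clustering algorithms are k-means clustering, fuzzy c-means clustering, and Gaussian (EM) clustering.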
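The point about training and testing datasets can be sketched as follows. This is an illustrative example under stated assumptions: the data points, the labels `A` and `B`, the fixed hold-out rule, and the helper names are hypothetical, and a 1-nearest-neighbour rule stands in for the classifiers listed above because it fits in a few lines of plain Python.

```python
# Hypothetical labeled data: ((feature_1, feature_2), label)
data = [((1.0, 1.1), "A"), ((0.9, 1.3), "A"), ((1.2, 0.8), "A"), ((1.1, 1.0), "A"),
        ((4.0, 4.2), "B"), ((4.3, 3.9), "B"), ((3.8, 4.1), "B"), ((4.1, 4.0), "B")]

def split(samples, test_every=4):
    """Hold out every `test_every`-th sample for testing (deterministic)."""
    train = [s for i, s in enumerate(samples) if (i + 1) % test_every != 0]
    test = [s for i, s in enumerate(samples) if (i + 1) % test_every == 0]
    return train, test

def predict_1nn(train, point):
    """Return the label of the closest training point (1-nearest neighbour)."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    _, label = min(train, key=lambda s: sq_dist(s[0], point))
    return label

def accuracy(train, test):
    """Fraction of held-out samples whose predicted label matches the true one."""
    correct = sum(predict_1nn(train, x) == y for x, y in test)
    return correct / len(test)

train, test = split(data)
score = accuracy(train, test)
```

The accuracy can be computed only because the held-out samples carry true labels to compare against. With unlabeled data there is nothing to compare predictions with, which is why the same verification step does not apply to clustering.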