Using sklearn's homogeneity_score in Python
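A perfectly homogeneous clustering is one in which every cluster contains only data points that share a single class label. Homogeneity ( homogeneity_score ) describes how close a clustering comes to this ideal.
The metric is independent of the absolute values of the labels. Permuting the cluster label values does not change the score in any way.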
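The permutation property can be checked directly. The following is a small sketch with made-up toy labels (the label values 7 and 42 are arbitrary and chosen only for illustration): the same grouping of samples gets the same score no matter which integers name the clusters.

```python
from sklearn.metrics import homogeneity_score

# Toy ground-truth labels (illustrative only)
labels_true = [0, 0, 1, 1]

# Three predictions describing the same grouping with different label values
score_a = homogeneity_score(labels_true, [0, 0, 1, 1])
score_b = homogeneity_score(labels_true, [1, 1, 0, 0])    # labels swapped
score_c = homogeneity_score(labels_true, [7, 7, 42, 42])  # arbitrary values

print(score_a, score_b, score_c)
```

All three scores should agree, because only the grouping of samples matters and not the names given to the groups.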
Syntax: sklearn.metrics.homogeneity_score(labels_true, labels_pred)
The metric is not symmetric: switching labels_true with labels_pred returns the completeness_score.
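The asymmetry can be illustrated with a short sketch on toy labels (the lists a and b are made up for this example). Swapping the two arguments changes the homogeneity score, and the swapped value matches what completeness_score reports for the original argument order.

```python
from sklearn.metrics import completeness_score, homogeneity_score

# Toy labelings (illustrative only): b splits each class of a in two
a = [0, 0, 1, 1]
b = [0, 1, 2, 3]

h_ab = homogeneity_score(a, b)   # a as ground truth, b as prediction
h_ba = homogeneity_score(b, a)   # arguments swapped
c_ab = completeness_score(a, b)  # completeness in the original order

print("homogeneity(a, b): ", h_ab)
print("homogeneity(b, a): ", h_ba)
print("completeness(a, b):", c_ab)
```

Here homogeneity(b, a) and completeness(a, b) should coincide, while homogeneity(a, b) differs from both.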
Parameters:
- labels_true: <int array, shape = [n_samples]>: accepts the ground truth class labels to be used as a reference.
- labels_pred: <array-like of shape (n_samples,)>: accepts the cluster labels to evaluate.
Returns:
homogeneity: <float>: a score between 0.0 and 1.0, where 1.0 stands for perfectly homogeneous labeling.
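To make the 0.0 to 1.0 range more concrete, here is a minimal pure-Python sketch of the entropy-based definition described in the scikit-learn user guide, h = 1 - H(C|K) / H(C), where H(C) is the entropy of the classes and H(C|K) is the conditional entropy of the classes given the clusters. The function name homogeneity_sketch is hypothetical and this is not the library's implementation, only an illustration of the formula.

```python
import math
from collections import Counter


def homogeneity_sketch(labels_true, labels_pred):
    """Illustrative re-computation of homogeneity: 1 - H(C|K) / H(C)."""
    n = len(labels_true)
    class_counts = Counter(labels_true)                   # samples per class
    cluster_counts = Counter(labels_pred)                 # samples per cluster
    joint_counts = Counter(zip(labels_true, labels_pred)) # per (class, cluster)

    # Entropy of the class distribution, H(C)
    h_c = -sum((cnt / n) * math.log(cnt / n) for cnt in class_counts.values())
    if h_c == 0.0:
        # A single class is trivially homogeneous
        return 1.0

    # Conditional entropy of the classes given the clusters, H(C|K)
    h_c_given_k = -sum(
        (n_ck / n) * math.log(n_ck / cluster_counts[k])
        for (_, k), n_ck in joint_counts.items()
    )
    return 1.0 - h_c_given_k / h_c


print(homogeneity_sketch([0, 0, 1, 1], [0, 1, 2, 3]))
print(homogeneity_sketch([0, 0, 1, 1], [0, 1, 0, 1]))
```

When every cluster holds a single class, H(C|K) is zero and the score reaches 1.0. When knowing the cluster says nothing about the class, H(C|K) equals H(C) and the score drops to 0.0.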
Example 1:
Python3
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.metrics import homogeneity_score
# Change to the directory that contains the data file first, e.g.:
# cd C:\Users\Dev\Desktop\Credit Card Fraud
# Loading the data
df = pd.read_csv('creditcard.csv')
# Separating the dependent and independent variables
y = df['Class']
X = df.drop('Class', axis=1)
# Building the clustering model
kmeans = KMeans(n_clusters=2)
# Training the clustering model
kmeans.fit(X)
# Storing the predicted Clustering labels
labels = kmeans.predict(X)
# Evaluating the performance
print(homogeneity_score(y, labels))
Output:
0.00496764949717645
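Example 1 depends on a local creditcard.csv file, and KMeans starts from random initial centers, so the exact value can differ between runs. As a self-contained alternative, the sketch below uses synthetic data from make_blobs. The dataset, the number of centers, and the random_state and n_init values are assumptions chosen for illustration and do not come from the original example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import homogeneity_score

# Synthetic data with known class labels (illustrative stand-in for a real dataset)
X, y = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=0)

# Fixing random_state makes the clustering repeatable between runs
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Compare the cluster assignments against the known classes
score = homogeneity_score(y, labels)
print(score)
```

Fixing random_state is a design choice for reproducibility; without it, repeated runs may report slightly different scores.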
Example 2: Perfectly homogeneous labeling:
Python3
from sklearn.metrics.cluster import homogeneity_score
# Evaluate the score
hscore = homogeneity_score([0, 1, 0, 1], [1, 0, 1, 0])
print(hscore)
Output:
1.0
Example 3: Non-perfect labelings that further split classes into more clusters can still be perfectly homogeneous:
Python3
from sklearn.metrics.cluster import homogeneity_score
# Evaluate the score
hscore = homogeneity_score([0, 0, 1, 1], [0, 1, 2, 3])
print(hscore)
Output:
0.9999999999999999
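A printed value such as 0.9999999999999999 is 1.0 up to floating-point rounding, and the exact digits may vary between scikit-learn versions. When checking for perfect homogeneity in code, a tolerant comparison is safer than testing for equality with 1.0. A minimal sketch using math.isclose from the standard library:

```python
import math

from sklearn.metrics import homogeneity_score

hscore = homogeneity_score([0, 0, 1, 1], [0, 1, 2, 3])

# Compare with a tolerance instead of using hscore == 1.0
is_perfect = math.isclose(hscore, 1.0, rel_tol=1e-9)
print(is_perfect)
```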
Example 4: Clusters that include samples from different classes do not make for a homogeneous labeling:
Python3
from sklearn.metrics.cluster import homogeneity_score
# Evaluate the score
hscore = homogeneity_score([0, 0, 1, 1], [0, 1, 0, 1])
print(hscore)
Output:
0.0