📅  Last modified: 2023-12-03 15:17:11.259000             🧑  Author: Mango
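K-Nearest Neighbors (KNN) is a widely used classification method. To classify a given point, it computes the distance between that point and every point in the training set, selects the K points with the smallest distances, and assigns the point to the class that receives the most votes among those K neighbors.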
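The rule can also be written in symbols. The notation below is introduced here for illustration and is not part of the original text; it assumes Euclidean distance, which is what the implementation in this article uses. For a query point $x$ with $n$ features and training points $x_i$ with class labels $y_i$:

$$
d(x, x_i) = \sqrt{\sum_{j=1}^{n} \left(x_j - x_{i,j}\right)^2},
\qquad
\hat{y} = \arg\max_{c} \sum_{i \in N_K(x)} \mathbf{1}[y_i = c]
$$

Here $N_K(x)$ is the set of indices of the K training points with the smallest $d(x, x_i)$, and $\hat{y}$ is the predicted class.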
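KNN classification consists of the following steps: compute the distance from the query point to each training point, select the K nearest training points, and take a majority vote over their classes.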
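The following is a simple KNN classifier implementation. It contains three main functions:
euclidean_distance(): computes the Euclidean distance between two points;
get_neighbors(): finds the K nearest neighbors;
predict_classification(): predicts the class from a majority vote among the nearest neighbors.

from math import sqrt

# Euclidean distance between two rows.
# The last column of a row is its class label, so it is skipped.
def euclidean_distance(row1, row2):
    distance = 0.0
    for i in range(len(row1) - 1):
        distance += (row1[i] - row2[i]) ** 2
    return sqrt(distance)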

# Return the num_neighbors training rows closest to test_row
def get_neighbors(train, test_row, num_neighbors):
    distances = []
    for train_row in train:
        dist = euclidean_distance(test_row, train_row)
        distances.append((train_row, dist))
    # Sort by distance, nearest first
    distances.sort(key=lambda tup: tup[1])
    neighbors = []
    for i in range(num_neighbors):
        neighbors.append(distances[i][0])
    return neighbors
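
# Predict the class of test_row by majority vote among its nearest neighbors
def predict_classification(train, test_row, num_neighbors):
    neighbors = get_neighbors(train, test_row, num_neighbors)
    # The class label is the last column of each neighbor row
    output_values = [row[-1] for row in neighbors]
    prediction = max(set(output_values), key=output_values.count)
    return prediction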
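
For comparison, the same three steps (distance, nearest K, majority vote) can be written more compactly with the Python standard library. This is only a sketch under stated assumptions, not the article's implementation: knn_predict and the sample data are hypothetical, math.dist requires Python 3.8 or later, and unlike the functions above the query point is passed without a label column.

```python
from collections import Counter
from heapq import nsmallest
from math import dist  # available since Python 3.8


def knn_predict(train, features, k):
    """Hypothetical helper: classify `features` by majority vote of its k nearest rows.

    Each row in `train` is assumed to be [feature_1, ..., feature_n, label].
    """
    # The k training rows with the smallest Euclidean distance to the query point
    nearest = nsmallest(k, train, key=lambda row: dist(row[:-1], features))
    # Count the labels among those neighbors and return the most frequent one
    votes = Counter(row[-1] for row in nearest)
    return votes.most_common(1)[0][0]


# Made-up sample data: two well-separated clusters
sample = [[1.0, 1.0, 0], [1.5, 2.0, 0], [2.0, 1.5, 0],
          [6.0, 5.0, 1], [7.0, 6.5, 1], [6.5, 7.0, 1]]
print(knn_predict(sample, [1.2, 1.4], k=3))  # 0
print(knn_predict(sample, [6.4, 6.0], k=3))  # 1
```

Using nsmallest avoids sorting the full list of distances when only the K smallest are needed, which matters little for ten rows but helps on larger training sets.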
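
The KNN classifier above can be used as follows: load the training and test data, then call predict_classification() on each test row to make a classification prediction.

# load data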
train = [[2.7810836, 2.550537003, 0],
         [1.465489372, 2.362125076, 0],
         [3.396561688, 4.400293529, 0],
         [1.38807019, 1.850220317, 0],
         [3.06407232, 3.005305973, 0],
         [7.627531214, 2.759262235, 1],
         [5.332441248, 2.088626775, 1],
         [6.922596716, 1.77106367, 1],
         [8.675418651, -0.242068655, 1],
         [7.673756466, 3.508563011, 1]]
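# test rows use the same layout as the training rows:
# two features followed by the expected class label
test = [[1.1, 3.1, 0],
        [4.1, 1.1, 0],
        [2.0, 8.0, 0],
        [5.1, 5.6, 0]]
# generate predictions
for row in test:
    prediction = predict_classification(train, row, num_neighbors=3)
    print(f"Expected {row[-1]}, Got {prediction}")

The output is:

Expected 0, Got 0
Expected 0, Got 0
Expected 0, Got 0
Expected 0, Got 0

As the output shows, the KNN classifier predicts all four test points correctly, an accuracy of 100% on this small test set.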
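
Accuracy here means the share of test rows whose predicted class equals the expected class. A minimal sketch of that calculation follows; the accuracy() helper and the label lists are made up for illustration and do not come from the article's data.

```python
def accuracy(expected, predicted):
    """Percentage of positions where the predicted label equals the expected label.

    Assumes both lists are non-empty and have the same length.
    """
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    # True counts as 1 and False as 0, so sum() gives the number of matches
    correct = sum(e == p for e, p in zip(expected, predicted))
    return 100.0 * correct / len(expected)


# Hypothetical labels: 4 of the 5 predictions match
print(accuracy([0, 0, 1, 1, 1], [0, 1, 1, 1, 1]))  # 80.0
```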