Pandas – Compute the Euclidean Distance Between Two Series
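Many machine learning algorithms rely on a distance metric, and the Euclidean distance is one of the most commonly used. It is simply the straight-line distance between two points. For two points x = (x_1, ..., x_n) and y = (y_1, ..., y_n), it is given by the formula:
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$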
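As a quick worked instance of the straight-line distance, the sketch below takes two illustrative 2-D points, (0, 0) and (3, 4), and squares, sums, and square-roots the coordinate differences by hand. The points and the names p, q, and squared_diffs are placeholders chosen for this illustration and do not come from the examples that follow.

```python
import math

# Two illustrative points (chosen for this sketch, not from the article).
p = (0, 0)
q = (3, 4)

# Square each coordinate difference, add them up, take the square root.
squared_diffs = [(a - b) ** 2 for a, b in zip(p, q)]  # [9, 16]
dist = math.sqrt(sum(squared_diffs))

print(dist)  # 5.0
```

This is the familiar 3-4-5 right triangle: the distance is the length of the hypotenuse.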
There are several ways to compute the Euclidean distance between two series. Some of them are shown below:
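Example 1: This example expands the squared difference as (a - b)^2 = a^2 + b^2 - 2ab and computes each term step by step.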
import pandas as pd
import numpy as np
# create pandas series
x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])
# compute each term of the expansion
# (a - b)^2 = a^2 + b^2 - 2ab step by step
p1 = np.sum([(a * a) for a in x])
p2 = np.sum([(b * b) for b in y])
# using zip() function to create an
# iterator which aggregates elements
# from two or more iterables
p3 = -1 * np.sum([(2 * a * b) for (a, b) in zip(x, y)])
# p1 + p2 + p3 is already a scalar, so no further sum is needed
dist = np.sqrt(p1 + p2 + p3)
print("Series 1:", x)
print("Series 2:", y)
print("Euclidean distance between two series is:", dist)
Output: both series are printed, followed by the distance, which is √125 ≈ 11.18.
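The step-by-step computation in Example 1 relies on the identity sum((a - b)^2) = sum(a^2) + sum(b^2) - 2 * sum(a * b). The following is a minimal sketch that checks this identity on the same two series using dot products; the variable names expanded and direct are chosen for this illustration only.

```python
import numpy as np
import pandas as pd

x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])

# Work on plain NumPy arrays so elements are paired by position.
a = x.to_numpy()
b = y.to_numpy()

# Expanded form: sum(a^2) + sum(b^2) - 2 * sum(a * b)
expanded = np.dot(a, a) + np.dot(b, b) - 2 * np.dot(a, b)

# Direct form: sum((a - b)^2)
direct = np.sum((a - b) ** 2)

print(expanded, direct)    # 125 125
print(np.sqrt(expanded))   # square root of the shared value
```

Both forms give the same squared distance. With floating-point data the expanded form can lose precision when the two series are nearly equal, so the direct form is usually the safer choice.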
Example 2: This example sums the squared element-wise differences directly.
import pandas as pd
import numpy as np
x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])
# zip() function creates an iterator
# which aggregates elements from two
# or more iterables
dist = np.sqrt(np.sum([(a-b)*(a-b) for a, b in zip(x, y)]))
print("Series 1:")
print(x)
print("Series 2:")
print(y)
print("Euclidean distance between two series is:", dist)
Output: both series are printed, followed by the same distance as in Example 1, √125 ≈ 11.18.
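The list comprehension above loops over the elements in Python. As a possible alternative, the same sum of squared differences can be written with pandas arithmetic, which operates on whole series at once. This is only a sketch, and it assumes that both series share the same index, as they do here with the default 0..4 labels.

```python
import numpy as np
import pandas as pd

x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])

# Element-wise subtraction and squaring, then a single reduction.
# Both series use the default index, so labels and positions agree.
squared_diffs = (x - y) ** 2
dist = np.sqrt(squared_diffs.sum())

print("Euclidean distance between two series is:", dist)
```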
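Example 3: In this example, we use the np.linalg.norm() function. It can return one of eight different matrix norms or one of an infinite number of vector norms. For a 1-D input with the default ord=None it returns the 2-norm, which for x - y is exactly the Euclidean distance.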
import pandas as pd
import numpy as np
x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])
dist = np.linalg.norm(x - y)
print("Series 1:")
print(x)
print("Series 2:")
print(y)
print("Euclidean distance between two series is:", dist)
Output: both series are printed, followed by the same distance as before, √125 ≈ 11.18.
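One point worth keeping in mind with x - y is that pandas pairs elements by index label, whereas zip() in the earlier examples pairs them by position. The sketch below uses two small series with deliberately shifted labels (hypothetical data, not from the examples above) to show the difference, and converts to NumPy arrays with to_numpy() when positional pairing is wanted.

```python
import numpy as np
import pandas as pd

# Hypothetical series whose index labels only partly overlap.
x = pd.Series([1, 2, 3], index=[0, 1, 2])
y = pd.Series([4, 5, 6], index=[1, 2, 3])

# Label-based subtraction: labels 0 and 3 have no partner, so they become NaN.
aligned = x - y
dist_aligned = np.linalg.norm(aligned)

# Position-based subtraction: drop the index before subtracting.
dist_positional = np.linalg.norm(x.to_numpy() - y.to_numpy())

print(aligned)
print("Label-aligned distance:", dist_aligned)   # NaN because of the gaps
print("Positional distance:", dist_positional)   # sqrt(27)
```

When both series carry the default index, as in the examples in this article, the two approaches agree.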
Example 4: Let's now try the same approach on longer series:
import pandas as pd
import numpy as np
x = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = pd.Series([12, 8, 7, 5, 6, 5, 3, 9, 7, 1])
dist = np.sqrt(np.sum([(a-b)*(a-b) for a, b in zip(x, y)]))
print("Series 1:")
print(x)
print("Series 2:")
print(y)
print("Euclidean distance between two series is:", dist)
Output: both series are printed, followed by the distance, which is √278 ≈ 16.67.
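Because zip() stops at the shorter input, the list-comprehension approach silently ignores extra elements when the two series differ in length. One way to guard against that is to wrap the computation in a small helper that checks the lengths first. The function name euclidean_distance below is hypothetical, and the sketch assumes Python 3.8 or later so that math.dist from the standard library is available as a cross-check.

```python
import math

import numpy as np
import pandas as pd


def euclidean_distance(a: pd.Series, b: pd.Series) -> float:
    """Positional Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("both series must have the same length")
    # Convert to float arrays so elements are paired by position.
    diff = a.to_numpy(dtype=float) - b.to_numpy(dtype=float)
    return float(np.sqrt(np.dot(diff, diff)))


x = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = pd.Series([12, 8, 7, 5, 6, 5, 3, 9, 7, 1])

dist = euclidean_distance(x, y)

# Cross-check against the standard library (Python 3.8+).
reference = math.dist(x.tolist(), y.tolist())

print("Euclidean distance between two series is:", dist)
print("math.dist gives:", reference)
```

Raising an error on mismatched lengths is a design choice for this sketch; depending on the data, truncating or padding might be more appropriate.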