📜  Pandas – Compute the Euclidean Distance Between Two Series

📅  Last modified: 2022-05-13 01:54:42.053000             🧑  Author: Mango


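Machine learning algorithms rely on many different distance metrics, and the Euclidean distance is the most commonly used of them. It is simply the straight-line distance between two points. For two points x and y with n coordinates each, it is given by:

      \[d(x, y) = \sqrt{\sum_{i=1}^{n}(x_{i}-y_{i})^{2}}\]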
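
As a worked instance of the formula, take x = (1, 2, 3, 4, 5) and y = (6, 7, 8, 9, 10), the same values used in the first examples that follow. Every coordinate differs by 5, so the sum has five equal terms:

      \[d(x, y) = \sqrt{\sum_{i=1}^{5}(x_{i}-y_{i})^{2}} = \sqrt{5 \cdot (-5)^{2}} = \sqrt{125} \approx 11.18\]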

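There are several ways to compute the Euclidean distance between two Series. A few of them are shown below.
Example 1: This version expands (a - b)^2 into a^2 + b^2 - 2ab and sums each of the three terms separately.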

import pandas as pd
import numpy as np
  
  
# create pandas series
x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])
  
# compute the distance step by step, using the expansion
# sum((a - b)^2) = sum(a^2) + sum(b^2) - 2 * sum(a * b)
p1 = np.sum([(a * a) for a in x])
p2 = np.sum([(b * b) for b in y])

# zip() pairs up the corresponding
# elements of x and y
p3 = -1 * np.sum([(2 * a * b) for (a, b) in zip(x, y)])

# p1 + p2 + p3 is already a scalar, so no further sum is needed
dist = np.sqrt(p1 + p2 + p3)
  
print("Series 1:", x)
print("Series 2:", y)
print("Euclidean distance between two series is:", dist)

Output: both Series are printed, followed by the distance, √125 ≈ 11.1803.
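
The three partial sums in Example 1 are dot products, so the same expansion can be sketched without Python-level loops. This is only an illustration: it assumes both Series share the same default index, it relies on `Series.dot`, and the names `sq_dist` and `dist_expanded` are introduced here, not taken from the original example.

```python
import numpy as np
import pandas as pd

x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])

# sum((x - y)^2) = x.x + y.y - 2 * x.y, with each term as a dot product
sq_dist = x.dot(x) + y.dot(y) - 2 * x.dot(y)

# illustrative name, not part of the original example
dist_expanded = np.sqrt(sq_dist)

print("Euclidean distance between two series is:", dist_expanded)
```

With floating-point data the expanded form can lose precision when the two Series are nearly equal, so subtracting first and squaring afterwards is usually the safer choice.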

Example 2: This version sums the squared differences directly.

import pandas as pd
import numpy as np
  
  
x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])
  
# zip() function creates an iterator
# which aggregates elements from two 
# or more iterables
dist = np.sqrt(np.sum([(a-b)*(a-b) for a, b in zip(x, y)]))    
  
print("Series 1:")
print(x)
  
print("Series 2:")
print(y)
  
print("Euclidean distance between two series is:", dist)

Output: both Series are printed, followed by the distance, √125 ≈ 11.1803.
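
The list comprehension in Example 2 can also be replaced by element-wise Series arithmetic. The sketch below assumes both Series have the same index, so that `x - y` pairs up the intended elements; `diff` and `dist_vectorized` are illustrative names.

```python
import numpy as np
import pandas as pd

x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])

# element-wise difference between the two Series
diff = x - y

# square each difference, sum them, and take the square root
dist_vectorized = np.sqrt((diff ** 2).sum())

# the loop-based form from Example 2, kept for comparison
dist_loop = np.sqrt(np.sum([(a - b) * (a - b) for a, b in zip(x, y)]))

print("Vectorized:", dist_vectorized)
print("Loop-based:", dist_loop)
```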

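Example 3: In this example we use the np.linalg.norm() function, which can return one of eight different matrix norms, or a vector norm, depending on its ord parameter. With the default arguments and a one-dimensional input it returns the 2-norm, which is exactly the Euclidean length of x - y.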

import pandas as pd
import numpy as np
  
  
x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([6, 7, 8, 9, 10])
dist = np.linalg.norm(x - y)
  
print("Series 1:")
print(x)
  
print("Series 2:")
print(y)
  
print("Euclidean distance between two series is:", dist)

Output: both Series are printed, followed by the distance, √125 ≈ 11.1803.
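
One caveat when subtracting Series, as Example 3 does: pandas aligns the operands on their index labels, not on position. The sketch below uses a hypothetical pair of Series whose indexes are shifted by one to show the effect, and falls back to `to_numpy()` for a purely positional comparison.

```python
import numpy as np
import pandas as pd

x = pd.Series([1, 2, 3], index=[0, 1, 2])
y = pd.Series([4, 6, 8], index=[1, 2, 3])  # hypothetical shifted index

# x - y aligns on labels; labels 0 and 3 have no partner and become NaN,
# so the norm of the result is NaN as well
aligned = np.linalg.norm(x - y)

# to_numpy() drops the index, so elements are paired by position
positional = np.linalg.norm(x.to_numpy() - y.to_numpy())

print("Label-aligned:", aligned)
print("Positional:", positional)
```

When the two Series are known to share an index, as in the examples in this article, both forms give the same result.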

Example 4: Now let's try a longer pair of Series:

import pandas as pd
import numpy as np
  
  
x = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = pd.Series([12, 8, 7, 5, 6, 5, 3, 9, 7, 1])
dist = np.sqrt(np.sum([(a-b)*(a-b) for a, b in zip(x, y)]))
  
print("Series 1:")
print(x)
  
print("Series 2:")
print(y)
  
print("Euclidean distance between two series is:", dist)

Output: both Series are printed, followed by the distance, √278 ≈ 16.6733.
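
If the calculation is needed in more than one place, it may be convenient to wrap it in a small helper. The function below is a minimal sketch, not a library API: `euclidean_distance` is a hypothetical name, and it assumes the two Series should be compared by position and must have equal length.

```python
import numpy as np
import pandas as pd


def euclidean_distance(a: pd.Series, b: pd.Series) -> float:
    """Euclidean distance between two equal-length Series, paired by position."""
    if len(a) != len(b):
        raise ValueError("both Series must have the same length")
    # convert to plain float arrays so index labels play no role
    diff = a.to_numpy(dtype=float) - b.to_numpy(dtype=float)
    return float(np.linalg.norm(diff))


x = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = pd.Series([12, 8, 7, 5, 6, 5, 3, 9, 7, 1])

dist = euclidean_distance(x, y)
print("Euclidean distance between two series is:", dist)
```

Raising on a length mismatch is a design choice made for this sketch; `zip()` in the loop-based examples would instead silently stop at the shorter Series.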