Python | Pandas dataframe.corr()

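Python is a great language for data analysis, largely because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.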

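Pandas dataframe.corr() computes the pairwise correlation of all columns in a data frame. Any NA values are excluded automatically. Non-numeric columns are ignored in older pandas releases; from pandas 2.0 onward the numeric_only parameter defaults to False, so pass numeric_only=True (or select the numeric columns first) when the data frame contains text columns.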
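To illustrate how NA values are excluded, here is a minimal sketch on a made-up data frame (the column names a, b and c are purely illustrative). It suggests that the exclusion happens per pair of columns: a row is dropped only for the pairs in which one of the two values is missing.

```python
import numpy as np
import pandas as pd

# Made-up toy data; the column names are illustrative only
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, np.nan],
    "b": [2.0, 4.0, 6.0, 8.0, 100.0],
    "c": [5.0, 3.0, np.nan, 1.0, 0.0],
})

# Pairwise correlation; rows with a missing value are skipped per pair
corr = df.corr()

# For the pair (a, b) only the first four rows have both values,
# and there b is exactly 2 * a
ab = corr.loc["a", "b"]

# The same number obtained by dropping incomplete rows for that pair by hand
manual = df[["a", "b"]].dropna().corr().loc["a", "b"]
```

The outlier 100.0 in column b does not affect the (a, b) entry, because a is missing in that row.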

Syntax: DataFrame.corr(method='pearson', min_periods=1)

Parameters:
method :
pearson : standard correlation coefficient
kendall : Kendall Tau correlation coefficient
spearman : Spearman rank correlation
min_periods : Minimum number of observations required per pair of columns to have a valid result. Currently only available for pearson and spearman correlation

Returns: y : DataFrame (the correlation matrix)
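The effect of the method and min_periods parameters can be sketched on a small made-up data frame. The columns x, y and z are hypothetical: y is a monotonic but non-linear function of x, and z has only two non-missing observations.

```python
import numpy as np
import pandas as pd

# Made-up data chosen to make the parameters visible
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [1.0, 4.0, 9.0, 16.0, 25.0],          # y = x ** 2
    "z": [np.nan, np.nan, np.nan, 2.0, 1.0],   # only two observations
})

# Pearson measures linear association, Spearman works on ranks
pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")

# Require at least 3 shared observations per pair of columns;
# pairs with fewer observations come back as NaN
strict = df.corr(method="pearson", min_periods=3)

pearson_xy = pearson.loc["x", "y"]
spearman_xy = spearman.loc["x", "y"]
strict_xz = strict.loc["x", "z"]
```

Here the Spearman value for (x, y) is 1 because the ranks agree perfectly, while the Pearson value is slightly below 1. Raising min_periods is a simple guard against coefficients computed from only a handful of rows.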


Note: The correlation of a variable with itself is 1.
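This property is easy to check. The following sketch uses randomly generated data (the seed and the column names p, q and r are arbitrary choices) and verifies that the diagonal of the result is 1 and that the matrix is symmetric.

```python
import numpy as np
import pandas as pd

# Random made-up data; seed and column names are arbitrary
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["p", "q", "r"])

corr = df.corr()

# Every variable is perfectly correlated with itself
diag_is_one = np.allclose(np.diag(corr), 1.0)

# corr(p, q) equals corr(q, p), so the matrix is symmetric
is_symmetric = np.allclose(corr, corr.T)
```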

The examples below read their data from a CSV file named nba.csv.

Example #1: Use the corr() function to find the correlation among the columns in the data frame using the 'pearson' method.

# importing pandas as pd
import pandas as pd
  
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
  
# Printing the first 10 rows of the data frame for visualization
df[:10]

Now use the corr() function to find the correlation among the columns. The data frame has only four numeric columns.

# To find the correlation among
# the columns using pearson method
# (numeric_only=True skips text columns; needs pandas 1.5 or later)
df.corr(method='pearson', numeric_only=True)

Output:

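The output data frame can be read cell by cell: each cell holds the correlation between its row variable and its column variable. As noted earlier, the correlation of a variable with itself is 1, so all diagonal values are 1.00.

Example #2: Use the corr() function to find the correlation among the columns in the data frame using the 'kendall' method.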

# importing pandas as pd
import pandas as pd
  
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
  
# To find the correlation among
# the columns using kendall method
# (numeric_only=True skips text columns; needs pandas 1.5 or later)
df.corr(method='kendall', numeric_only=True)

Output:

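The output data frame can be read cell by cell: each cell holds the correlation between its row variable and its column variable. As noted earlier, the correlation of a variable with itself is 1, so all diagonal values are 1.00.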
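When a data frame mixes text and numeric columns, one way to make the choice of columns explicit is to select the numeric ones before calling corr(). The sketch below uses a few made-up rows (they are not taken from nba.csv, and the column names are only stand-ins) and then reads a single cell of the result.

```python
import pandas as pd

# Made-up rows standing in for a mixed-type table; not taken from nba.csv
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Team": ["X", "X", "Y", "Y"],
    "Age": [22, 25, 30, 34],
    "Weight": [180.0, 200.0, 210.0, 240.0],
})

# Keep only the numeric columns, then correlate them
numeric = df.select_dtypes(include="number")
corr = numeric.corr(method="spearman")

# Read one cell: row variable "Age" against column variable "Weight"
age_weight = corr.loc["Age", "Weight"]
```

Selecting the columns first behaves the same way across pandas versions, since it does not depend on the default value of numeric_only.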