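How to Get Descriptive Statistics for a Pandas DataFrame?
The describe() method in Python Pandas computes descriptive statistics such as count, number of unique values, mean, standard deviation, minimum, and maximum. This article shows how to get descriptive statistics for a Pandas DataFrame.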
Syntax:
df['cname'].describe(percentiles=None, include=None, exclude=None)
df.describe(percentiles=None, include=None, exclude=None)
Parameters:
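percentiles: list of percentiles to include in the output, each a number between 0 and 1. The default is [0.25, 0.5, 0.75].
include: list of data types to include in the result, or 'all' to summarize every column. With the default (None), only numeric columns are summarized. Ignored when describe() is called on a single column (a Series).
exclude: list of data types to leave out of the result. Ignored when describe() is called on a single column (a Series).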
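The parameters above can be tried out with a short sketch. The sample data and the variable names (custom, numbers_only, no_numbers) are chosen here for illustration only and are not part of the original examples.

```python
import pandas as pd

# Illustrative sample data (an assumption, not taken from a real dataset)
df = pd.DataFrame({
    "Product": ["Mobile", "AC", "Mobile", "Sofa", "Laptop"],
    "Price": [20000, 28000, 22000, 19000, 45000],
})

# percentiles: ask for the 10th and 90th percentiles of one column
custom = df["Price"].describe(percentiles=[0.1, 0.9])

# include: summarize only the numeric columns
numbers_only = df.describe(include=["number"])

# exclude: summarize everything except the numeric columns
no_numbers = df.describe(exclude=["number"])

print(custom)
print(numbers_only)
print(no_numbers)
```

For the non-numeric Product column, the summary reports count, unique, top, and freq instead of mean and standard deviation.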
Example 1:
# Import package
from pandas import DataFrame
# Create DataFrame
cart = {
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}
df = DataFrame(cart, columns=['Product', 'Price', 'Year'])
# Original DataFrame
print("Original DataFrame:\n", df)
# Descriptive statistics of the Price column
print("\nDescriptive statistics of Price:\n")
stats = df['Price'].describe()
print(stats)
Output:
Example 2:
# Import package
from pandas import DataFrame
# Create DataFrame
cart = {
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}
df = DataFrame(cart, columns=['Product', 'Price', 'Year'])
# Original DataFrame
print("Original DataFrame:\n", df)
# Descriptive statistics of the Year column
print("\nDescriptive statistics of Year:\n")
stats = df['Year'].describe()
print(stats)
Output:
Example 3:
# Import package
from pandas import DataFrame
# Create DataFrame
cart = {
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}
df = DataFrame(cart, columns=['Product', 'Price', 'Year'])
# Original DataFrame
print("Original DataFrame:\n", df)
# Descriptive statistics of the whole DataFrame
print("\nDescriptive statistics of whole DataFrame:\n")
stats = df.describe(include = 'all')
print(stats)
Output:
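When include = 'all' is used as in Example 3, a statistic that does not apply to a column's type is reported as NaN. The following is a minimal sketch of how to check this, reusing the data from the examples. The individual lookups with .loc are added here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Product": ["Mobile", "AC", "Mobile", "Sofa", "Laptop"],
    "Price": [20000, 28000, 22000, 19000, 45000],
    "Year": [2014, 2015, 2016, 2017, 2018],
})

stats = df.describe(include="all")

# The mean is not defined for the text column, so it is NaN
print(stats.loc["mean", "Product"])

# The number of unique values is not reported for numeric columns
print(stats.loc["unique", "Price"])

# Each statistic is filled in where it applies
print(stats.loc["unique", "Product"], stats.loc["mean", "Price"])
```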
Example 4:
In this example, we print each descriptive statistic individually.
# Import package
from pandas import DataFrame
# Create DataFrame
cart = {
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}
df = DataFrame(cart, columns=['Product', 'Price', 'Year'])
# Original DataFrame
print("Original DataFrame:\n", df)
# Print Count of Price
print("\nCount of Price:\n")
counts = df['Price'].count()
print(counts)
# Print mean of Price
print("\nMean of Price:\n")
m = df['Price'].mean()
print(m)
# Print maximum value of Price
print("\nMaximum value of Price:\n")
mx = df['Price'].max()
print(mx)
# Print standard deviation of Price
print("\nStandard deviation of Price:\n")
sd = df['Price'].std()
print(sd)
Output:
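The individual statistics from Example 4 can also be read out of a single describe() result, or requested together with agg(). This is a sketch of two alternatives rather than a replacement for the example above. The variable names are illustrative.

```python
import pandas as pd

prices = pd.Series([20000, 28000, 22000, 19000, 45000], name="Price")

# describe() returns a Series indexed by statistic name,
# so single values can be looked up by label
stats = prices.describe()
mean_price = stats["mean"]
max_price = stats["max"]
std_price = stats["std"]
print(mean_price, max_price, std_price)

# agg() computes only the statistics that are listed
summary = prices.agg(["count", "mean", "max", "std"])
print(summary)
```

Looking values up in the describe() result avoids recomputing each statistic, while agg() is convenient when only a few of them are needed.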