Last modified: 2020-10-29 01:55:47             Author: Mango
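The describe() method computes summary statistics, such as percentiles, mean, and standard deviation, for the numeric values of a Series or DataFrame. It analyzes both numeric and object Series, as well as DataFrame column sets of mixed data types.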
DataFrame.describe(percentiles=None, include=None, exclude=None)
It returns a statistical summary of the Series or DataFrame.
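The percentiles parameter in the signature above controls which percentile rows appear in the summary. A minimal sketch, assuming a small illustrative Series (the values are made up for this example) and passing 0.5 explicitly so the median row is present regardless of pandas version:

```python
import pandas as pd

# Illustrative data, not taken from the article
s = pd.Series([1, 2, 3, 4, 5])

# Ask for the 10th, 50th, and 90th percentiles instead of the default 25/50/75
summary = s.describe(percentiles=[0.1, 0.5, 0.9])
print(summary)
```

The percentile rows are labeled with the requested values ("10%", "50%", "90%") and sit between the min and max rows.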
import pandas as pd
import numpy as np
a1 = pd.Series([1, 2, 3])
a1.describe()
Output
count 3.0
mean 2.0
std 1.0
min 1.0
25% 1.5
50% 2.0
75% 2.5
max 3.0
dtype: float64
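Each row of the numeric summary can be reproduced with an individual Series method, which may help when checking what a given row means. The sketch below rebuilds the summary by hand; the name manual is introduced here for illustration only:

```python
import pandas as pd

a1 = pd.Series([1, 2, 3])
summary = a1.describe()

# Rebuild the same rows one statistic at a time
manual = pd.Series({
    'count': a1.count(),
    'mean': a1.mean(),
    'std': a1.std(),            # sample standard deviation (ddof=1)
    'min': a1.min(),
    '25%': a1.quantile(0.25),   # linear interpolation between data points
    '50%': a1.quantile(0.50),
    '75%': a1.quantile(0.75),
    'max': a1.max(),
}, dtype='float64')

print(manual)
```

Note that std is the sample standard deviation, which is why it is 1.0 here rather than the population value of about 0.816.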
import pandas as pd
import numpy as np
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()
Output
count 4
unique 3
top q
freq 2
dtype: object
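For a Series of strings the summary has four rows: count, unique, top, and freq. As a rough cross-check, the same figures can be read from count(), nunique(), and value_counts(); the variable counts below is a name chosen for this sketch:

```python
import pandas as pd

a1 = pd.Series(['p', 'q', 'q', 'r'])
summary = a1.describe()

# value_counts() sorts by frequency, so the first entry is the most common value
counts = a1.value_counts()

print(a1.count())        # non-null entries      -> the "count" row
print(a1.nunique())      # distinct values       -> the "unique" row
print(counts.index[0])   # most frequent value   -> the "top" row
print(counts.iloc[0])    # how often it occurs   -> the "freq" row
```

When several values tie for the highest frequency, which one is reported as top should not be relied on.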
import pandas as pd
import numpy as np
# Build a DataFrame with one categorical, one numeric, and one object column
info = pd.DataFrame({'categorical': pd.Categorical(['s', 't', 'u']),
                     'numeric': [1, 2, 3],
                     'object': ['p', 'q', 'r']
                     })
# Summarize only the numeric columns
info.describe(include=[np.number])
# Summarize only the object columns (np.object was removed in NumPy 1.24; use the built-in object)
info.describe(include=[object])
# Summarize only the categorical columns
info.describe(include=['category'])
Output
categorical
count 3
unique 3
top u
freq 1
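The include argument filters columns by dtype before summarizing. The sketch below uses the same DataFrame and compares the selected columns with what select_dtypes() picks; treating the two as equivalent is an assumption about how the filtering behaves, not something the article states:

```python
import pandas as pd

info = pd.DataFrame({
    'categorical': pd.Categorical(['s', 't', 'u']),
    'numeric': [1, 2, 3],
    'object': ['p', 'q', 'r'],
})

# Dtypes can be given as strings such as 'number' or 'category'
numeric_summary = info.describe(include=['number'])
category_summary = info.describe(include=['category'])

print(list(numeric_summary.columns))
print(list(category_summary.columns))

# select_dtypes() accepts the same kind of dtype list
print(list(info.select_dtypes(include=['category']).columns))
```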
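import pandas as pd
import numpy as np
# Build a DataFrame with one categorical, one numeric, and one object column
info = pd.DataFrame({'categorical': pd.Categorical(['s', 't', 'u']),
                     'numeric': [1, 2, 3],
                     'object': ['p', 'q', 'r']
                     })
# Default: summarize the numeric columns only
info.describe()
# Summarize every column regardless of dtype
info.describe(include='all')
# Summarize a single column, accessed as an attribute
info.numeric.describe()
# Restrict the summary to specific dtypes
info.describe(include=[np.number])
info.describe(include=[object])
info.describe(include=['category'])
# Leave out specific dtypes instead (np.object was removed in NumPy 1.24; use the built-in object)
info.describe(exclude=[np.number])
info.describe(exclude=[object])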
Output
        categorical  numeric
count             3      3.0
unique            3      NaN
top               u      NaN
freq              1      NaN
mean            NaN      2.0
std             NaN      1.0
min             NaN      1.0
25%             NaN      1.5
50%             NaN      2.0
75%             NaN      2.5
max             NaN      3.0
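The NaN cells in the table above appear because numeric and non-numeric columns produce different sets of rows. A short sketch, using the same DataFrame, of how the default call differs from include='all'; the variable names are chosen for this example:

```python
import pandas as pd

info = pd.DataFrame({
    'categorical': pd.Categorical(['s', 't', 'u']),
    'numeric': [1, 2, 3],
    'object': ['p', 'q', 'r'],
})

# With no arguments, a mixed DataFrame is summarized by its numeric columns only
default_summary = info.describe()
print(list(default_summary.columns))

# include='all' keeps every column and takes the union of both sets of rows
full_summary = info.describe(include='all')
print(list(full_summary.columns))
print(list(full_summary.index))
```

Rows that do not apply to a column, such as mean for a categorical column or top for a numeric one, are filled with NaN.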