How to calculate skewness and kurtosis in Python?

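Skewness is a statistical measure that describes the shape of a distribution. It quantifies how asymmetric the distribution is about its center, rather than describing the frequency counts themselves. With respect to skewness, a distribution falls into one of two types: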

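Symmetric: if the distribution looks the same to the left and right of its center point, it is called a symmetric distribution.
Asymmetric: if the distribution does not look the same to the left and right of its center point, it is called an asymmetric (skewed) distribution.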

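Distribution based on the skewness value:

Skewness = 0: the distribution is symmetric; the normal distribution is one example.
Skewness > 0: the right tail of the distribution carries more weight (positive skew).
Skewness < 0: the left tail of the distribution carries more weight (negative skew).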
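
The sign rules above can be checked on a few small datasets. The following is a minimal sketch; the three lists are made up for illustration and are not part of the article's data.

```python
from scipy.stats import skew

# Illustrative datasets (invented for this sketch)
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 10]      # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]   # mirror image, so the long tail is on the left

skew_symmetric = skew(symmetric)
skew_right = skew(right_tailed)
skew_left = skew(left_tailed)

print("symmetric:   ", skew_symmetric)   # zero: both sides balance
print("right-tailed:", skew_right)       # positive
print("left-tailed: ", skew_left)        # negative
```

Mirroring a dataset flips the sign of its skewness but not its magnitude, which is why the last two results differ only in sign.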

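Kurtosis:

Kurtosis is also a statistical measure and an important characteristic of a frequency distribution. It indicates whether a distribution is heavy-tailed or light-tailed relative to a normal distribution, and so gives information about the shape of the frequency distribution.

The kurtosis of a normal distribution equals 3 (Pearson's definition).
A distribution with kurtosis < 3 is called platykurtic; it has lighter tails than a normal distribution.
A distribution with kurtosis > 3 is called leptokurtic; it tends to produce more outliers than a normal distribution.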
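
As a rough illustration of these categories, the sketch below computes kurtosis on the "normal = 3" scale by passing fisher=False to scipy.stats.kurtosis. The two datasets and the classify helper are hypothetical, written only for this example.

```python
from scipy.stats import kurtosis

# Illustrative datasets (invented for this sketch)
flat = list(range(1, 11))        # evenly spread values, light tails
peaked = [0] * 18 + [-10, 10]    # mostly central values plus two extreme ones

# fisher=False selects Pearson's definition, where a normal distribution scores 3
pearson_flat = kurtosis(flat, fisher=False)
pearson_peaked = kurtosis(peaked, fisher=False)


def classify(pearson_kurtosis):
    """Hypothetical helper: name the category for a Pearson kurtosis value."""
    if pearson_kurtosis < 3:
        return "platykurtic"
    if pearson_kurtosis > 3:
        return "leptokurtic"
    return "mesokurtic"


print(pearson_flat, classify(pearson_flat))
print(pearson_peaked, classify(pearson_peaked))
```

On real samples the value is almost never exactly 3, so in practice the comparison is usually made with some tolerance or a formal test rather than a strict equality.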

This article focuses on how to calculate skewness and kurtosis in Python.

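How to calculate skewness and kurtosis in Python?

Calculating skewness and kurtosis takes a few steps, which are discussed below.

Step 1: Import the SciPy library.

SciPy is an open-source scientific library. It provides built-in functions for calculating skewness and kurtosis. We can import the library with the code below.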

Python3

# Importing scipy
import scipy

Step 2: Create a dataset.

Before calculating skewness and kurtosis, we need to create a dataset.

Python3

# Creating a dataset
dataset = [10, 25, 14, 26, 35, 45, 67, 90, 
           40, 50, 60, 10, 16, 18, 20]

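Step 3: Calculate the skewness of the dataset.

We can calculate the skewness of a dataset with the built-in skew() function. Its syntax is as follows: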

Syntax:

scipy.stats.skew(array, axis=0, bias=True)

Parameters:

array: The input array (or array-like object) containing the elements.
axis: The axis along which the skewness is computed (default: axis = 0).
bias: If False, the calculations are corrected for statistical bias (default: bias = True).

Return Type:

Skewness value of the data set, along the axis.
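
To make the bias and axis parameters more concrete, here is a small sketch. It reuses the 15-value dataset from this article's example; the 2-D array named table is invented for illustration.

```python
import numpy as np
from scipy.stats import skew

dataset = [88, 85, 82, 97, 67, 77, 74, 86,
           81, 95, 77, 88, 85, 76, 81]

# bias=True (the default) returns the plain moment-based estimate;
# bias=False applies a small-sample correction to it
biased = skew(dataset, bias=True)
corrected = skew(dataset, bias=False)
print("bias=True: ", biased)
print("bias=False:", corrected)

# With a 2-D input, axis selects the direction of the computation.
# axis=1 gives one skewness value per row.
table = np.array([[1, 2, 3, 10],
                  [10, 9, 8, 1]])
per_row = skew(table, axis=1)
print("per row:", per_row)
```

The correction matters most for small samples; as the number of observations grows, the two estimates move closer together.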

Example:

Python3

# Importing library
from scipy.stats import skew
  
# Creating a dataset
dataset = [88, 85, 82, 97, 67, 77, 74, 86, 
           81, 95, 77, 88, 85, 76, 81]
  
# Calculate the skewness
print(skew(dataset, axis=0, bias=True))

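Output:

[Output image: skewness of the dataset]

The result is a small positive number (roughly 0.03), which indicates that the distribution is slightly positively skewed.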
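
For readers who want to see what skew() computes, the sketch below reproduces the bias=True result by hand from central moments, using the usual moment-based formula g1 = m3 / m2^(3/2). The variable names are chosen for this illustration only.

```python
import numpy as np
from scipy.stats import skew

dataset = [88, 85, 82, 97, 67, 77, 74, 86,
           81, 95, 77, 88, 85, 76, 81]

data = np.array(dataset, dtype=float)
deviations = data - data.mean()

# Second and third central moments (dividing by n, matching bias=True)
m2 = np.mean(deviations ** 2)
m3 = np.mean(deviations ** 3)

manual_skewness = m3 / m2 ** 1.5
scipy_skewness = skew(dataset, bias=True)

print("manual:", manual_skewness)
print("scipy: ", scipy_skewness)
```

The two printed values should agree up to floating-point rounding.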

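Step 4: Calculate the kurtosis of the dataset.

We can calculate the kurtosis of a dataset with the built-in kurtosis() function. Its syntax is as follows: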

Syntax:

scipy.stats.kurtosis(array, axis=0, fisher=True, bias=True)

Parameters:

array: The input array (or array-like object) containing the elements.
axis: The axis along which the kurtosis is computed (default: axis = 0).
fisher = True: Fisher's definition is used (normal distribution ==> 0.0). This is the default.
fisher = False: Pearson's definition is used (normal distribution ==> 3.0).
bias: If False, the calculations are corrected for statistical bias (default: bias = True).

Return Type:

Kurtosis value of the data set, along the axis.
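
The effect of the fisher and bias flags can be seen by calling kurtosis() several ways on the same data. This is a minimal sketch using the 15-value dataset from this article's example; the variable names are illustrative.

```python
from scipy.stats import kurtosis

dataset = [88, 85, 82, 97, 67, 77, 74, 86,
           81, 95, 77, 88, 85, 76, 81]

# Defaults: fisher=True (excess kurtosis, normal = 0) and bias=True
excess = kurtosis(dataset)

# Pearson's definition: same quantity shifted so that normal = 3
pearson = kurtosis(dataset, fisher=False)

# Excess kurtosis with the small-sample bias correction applied
excess_corrected = kurtosis(dataset, bias=False)

print("Fisher, bias=True: ", excess)
print("Pearson, bias=True:", pearson)
print("Fisher, bias=False:", excess_corrected)
```

For this small dataset the two bias settings give results of opposite sign, so it is worth stating which setting was used whenever a kurtosis value is reported.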

Example:

Python3

# Importing library
  
from scipy.stats import kurtosis
  
# Creating a dataset
dataset = [88, 85, 82, 97, 67, 77, 74, 86,
           81, 95, 77, 88, 85, 76, 81]
  
  
# Calculate the kurtosis
print(kurtosis(dataset, axis=0, bias=True))

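Output:

[Output image: kurtosis of the dataset]

Because fisher defaults to True, the printed value is the excess kurtosis, for which a normal distribution scores 0. With bias=True it is negative for this dataset (roughly -0.29), which indicates somewhat lighter tails than a normal distribution. A positive value would instead indicate that the distribution has more values in the tails than a normal distribution.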
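
As a possible shortcut, scipy.stats.describe returns both statistics in a single call. The sketch below assumes its default settings (bias=True, with kurtosis reported as excess kurtosis) and compares the fields against the individual functions used in this article.

```python
from scipy.stats import describe, kurtosis, skew

dataset = [88, 85, 82, 97, 67, 77, 74, 86,
           81, 95, 77, 88, 85, 76, 81]

# describe() returns a result object with several summary statistics
summary = describe(dataset)

print("observations:", summary.nobs)
print("skewness:    ", summary.skewness)
print("kurtosis:    ", summary.kurtosis)

# These should match the separate calls made with default arguments
same_skewness = abs(summary.skewness - skew(dataset)) < 1e-12
same_kurtosis = abs(summary.kurtosis - kurtosis(dataset)) < 1e-12
print(same_skewness, same_kurtosis)
```

This is convenient for a quick overview, while the separate skew() and kurtosis() calls give direct control over the fisher option.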