📜  Python statistics | variance()

📅  Last modified: 2022-05-13 01:54:52.149000             🧑  Author: Mango

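The statistics module provides powerful tools for common statistical calculations, and variance() is one of them. It computes the variance of a sample of data (a sample is a subset of the population data).
Use variance() only when you need the variance of a sample. A separate function, pvariance(), computes the variance of an entire population.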
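To make the difference between the two functions concrete, the following minimal sketch runs both on the same numbers. The data values are hypothetical and chosen only for illustration; the point is that variance() divides the sum of squared deviations by n - 1, while pvariance() divides it by n.

```python
import statistics

# Hypothetical data, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Treat the data as a sample: the sum of squared
# deviations is divided by n - 1.
sample_var = statistics.variance(data)

# Treat the data as the entire population: the sum of
# squared deviations is divided by n.
population_var = statistics.pvariance(data)

print("Sample variance    :", sample_var)
print("Population variance:", population_var)
```

For the same data, the sample variance is always at least as large as the population variance, because it divides by the smaller number.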
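In statistics, the variance is the average squared deviation of a variable from its mean. In effect, it measures how far a set of data is spread out around its mean. A low variance indicates that the data points are clustered close together, while a high variance indicates that they are spread widely around the mean.
Variance is an important tool in the sciences, where statistical analysis of data is common. It is the square of the standard deviation of a data set and is also known as the second central moment of a distribution. It is usually denoted by s^{2}, \sigma^{2}, or \operatorname{Var}(X).
The sample variance is computed with the following formula:

s^{2} = \frac{1}{n-1} \sum_{i=1}^{n} (x_{i} - \bar{x})^{2}

where \bar{x} is the sample mean and n is the number of data points.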
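As a cross-check, the sample variance formula (sum of squared deviations from the sample mean, divided by n - 1) can be written out by hand and compared with the library result. This is only a sketch: manual_sample_variance is a hypothetical helper written for this comparison and is not part of the statistics module.

```python
import math
import statistics


def manual_sample_variance(values):
    """Hypothetical helper: sample variance straight from the formula."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    xbar = sum(values) / n
    # Sum of squared deviations from the mean, divided by n - 1.
    return sum((x - xbar) ** 2 for x in values) / (n - 1)


sample = [2.74, 1.23, 2.63, 2.22, 3, 1.98]
by_hand = manual_sample_variance(sample)
by_library = statistics.variance(sample)

print(by_hand, by_library)
# The two values agree up to floating-point rounding.
print(math.isclose(by_hand, by_library))
# The variance is the square of the standard deviation.
print(math.isclose(statistics.stdev(sample) ** 2, by_library))
```

The comparison uses math.isclose() rather than ==, because the hand-written version works in plain floating point and may differ from the library in the last few digits.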

Code #1: Basic use of variance()

Python3
# Python code to demonstrate the working of
# variance() function of Statistics Module
 
# Importing Statistics module
import statistics
 
# Creating a sample of data
sample = [2.74, 1.23, 2.63, 2.22, 3, 1.98]
 
# Print the variance of the sample.
# When xbar is not supplied, variance() calculates
# the sample mean itself.
print("Variance of sample set is %s"
      % statistics.variance(sample))


Output:

Variance of sample set is 0.40924


Code #2: Demonstrating variance() on a range of data types

Python3

# Python code to demonstrate variance()
# function on varying range of data-types
 
# importing statistics module
from statistics import variance
 
# importing fractions as parameter values
from fractions import Fraction as fr
 
# tuple of a set of positive integers
# numbers are spread apart but not very much
sample1 = (1, 2, 5, 4, 8, 9, 12)
 
# tuple of a set of negative integers
sample2 = (-2, -4, -3, -1, -5, -6)
 
# tuple of a set of positive and negative numbers
# data-points are spread apart considerably
sample3 = (-9, -1, -0, 2, 1, 3, 4, 19)
 
# tuple of a set of fractional numbers
sample4 = (fr(1, 2), fr(2, 3), fr(3, 4),
                     fr(5, 6), fr(7, 8))
 
# tuple of a set of floating point values
sample5 = (1.23, 1.45, 2.1, 2.2, 1.9)
 
# Print the variance of each samples
print("Variance of Sample1 is % s " %(variance(sample1)))
print("Variance of Sample2 is % s " %(variance(sample2)))
print("Variance of Sample3 is % s " %(variance(sample3)))
print("Variance of Sample4 is % s " %(variance(sample4)))
print("Variance of Sample5 is % s " %(variance(sample5)))

Output:

Variance of Sample1 is 15.80952380952381 
Variance of Sample2 is 3.5 
Variance of Sample3 is 61.125 
Variance of Sample4 is 1/45 
Variance of Sample5 is 0.17613000000000006 
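The output above shows 1/45 for the Fraction sample, which suggests that the result type follows the input type. The short sketch below checks this for Fraction, Decimal, and float inputs; the Decimal and float values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from decimal import Decimal
from fractions import Fraction
from statistics import variance

# Fraction input gives an exact Fraction result.
frac_result = variance([Fraction(1, 2), Fraction(2, 3), Fraction(3, 4),
                        Fraction(5, 6), Fraction(7, 8)])

# Decimal input gives a Decimal result (hypothetical values).
dec_result = variance([Decimal("1.5"), Decimal("2.5"), Decimal("3.5")])

# Float input gives a float result (same hypothetical values).
float_result = variance([1.5, 2.5, 3.5])

print(type(frac_result).__name__, frac_result)
print(type(dec_result).__name__, dec_result)
print(type(float_result).__name__, float_result)
```

Keeping Fraction or Decimal data in its own type avoids the rounding visible in the float results, such as the trailing digits in the Sample5 output.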


Code #3: Demonstrating the use of the xbar parameter

Python3

# Python code to demonstrate
# the use of xbar parameter
 
# Importing statistics module
import statistics
 
# creating a sample tuple
sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)
 
# calculating the mean of sample set
m = statistics.mean(sample)
 
 
# calculating the variance of sample set
print("Variance of Sample set is % s"
    %(statistics.variance(sample, xbar = m)))

Output:

Variance of Sample set is 0.3656666666666667
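One plausible reason to pass xbar is to avoid recomputing a mean that is already known. The sketch below, which reuses the sample from Code #3, computes the mean once and hands it to both variance() and stdev(); it then compares the result with the default call. Treat it as an illustration of the parameter rather than a performance recommendation.

```python
import math
import statistics

sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)

# Compute the mean once and reuse it.
m = statistics.mean(sample)
var_with_xbar = statistics.variance(sample, xbar=m)
std_with_xbar = statistics.stdev(sample, xbar=m)

# Default call: the function computes the mean itself.
var_default = statistics.variance(sample)

print(var_with_xbar, var_default)
# Both routes agree up to floating-point rounding.
print(math.isclose(var_with_xbar, var_default))
print(math.isclose(std_with_xbar ** 2, var_with_xbar))
```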


Code #4: Passing an xbar value that differs from the actual mean

Python3

# Python code to demonstrate the error caused
# when garbage value of xbar is entered
 
# Importing statistics module
import statistics
 
# creating a sample tuple
sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)
 
# calculating the mean of sample set
m = statistics.mean(sample)
 
# Actual value of mean after calculation
# comes out to 1.6833333333333333
# But to demonstrate xbar error let's enter
# -100 as the value for xbar parameter
print(statistics.variance(sample, xbar = -100))

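Output:

0.3656666666663053

Note: This result differs slightly in precision from the output of Code #3. No exception is raised, but the official documentation warns that using arbitrary values for xbar can lead to invalid or impossible results, so the number printed may differ considerably between Python versions. Always pass the actual mean of the data.

Code #5: Demonstrating StatisticsError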

Python3

# Python code to demonstrate StatisticsError
 
# importing Statistics module
import statistics
 
# creating an empty data set
sample = []
 
# will raise StatisticsError
print(statistics.variance(sample))

Output:

Traceback (most recent call last):
  File "/home/64bf6d80f158b65d2b75c894d03a7779.py", line 10, in <module>
    print(statistics.variance(sample))
  File "/usr/lib/python3.5/statistics.py", line 555, in variance
    raise StatisticsError('variance requires at least two data points')
statistics.StatisticsError: variance requires at least two data points
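When the input may contain fewer than two data points, the exception can be caught instead of letting the program stop. The following is a minimal sketch; safe_variance is a hypothetical wrapper introduced here for illustration, and returning None is just one possible way to signal that no variance is available.

```python
import statistics


def safe_variance(data):
    """Hypothetical helper: return None when variance() cannot be computed."""
    try:
        return statistics.variance(data)
    except statistics.StatisticsError:
        # Raised when data has fewer than two values.
        return None


print(safe_variance([]))            # empty data set
print(safe_variance([4.2]))         # a single data point is not enough
print(safe_variance([1, 2, 3, 4]))  # two or more points work normally
```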


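Applications:
Variance is a very important tool in statistics and in working with large amounts of data. For example, when the true population mean is unknown and the sample mean is used in its place, the sample variance serves as an estimator of the population variance (with the n - 1 divisor used by variance(), the estimate is unbiased when the data points are representative). Real-world observations, such as the rise and fall of all of a company's shares over a day, can never cover every possible observation. The variance is therefore computed from a limited set of data, and although it will not exactly match the value computed over the whole population, it still gives an estimate that is good enough to base further calculations on.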
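The idea of estimating a population variance from samples can be illustrated with a tiny, fully enumerable case. The sketch below assumes a hypothetical population of four values and draws every possible sample of size 2 with replacement; under those assumptions, the average of the sample variances matches the population variance.

```python
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical population, small enough to enumerate completely.
population = [1, 2, 3, 4]

# Every ordered sample of size 2, drawn with replacement.
all_samples = list(product(population, repeat=2))

# Sample variance (n - 1 divisor) of each possible sample.
sample_variances = [variance(s) for s in all_samples]

population_variance = pvariance(population)
average_sample_variance = mean(sample_variances)

print("Population variance        :", population_variance)
print("Average of sample variances:", average_sample_variance)
```

Any single sample may give a value far from the population variance; it is only the average over all possible samples that matches in this setup.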