Python statistics | pvariance()

Last modified: 2022-05-13 01:55:06.707000    Author: Mango


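Prerequisite: Python statistics | variance()
The pvariance() function computes the variance of an entire population rather than the variance of a sample. The difference between variance() and pvariance() lies in how the data is treated: variance() treats the data as a sample, measuring deviations from the sample mean and dividing by n - 1, whereas pvariance() treats the data as the whole population, measuring deviations from the population mean and dividing by n.
Population variance, like sample variance, describes how the data points in a particular population are spread out. It is the average of the squared distances from each data point to the mean of the data set. Population variance is a parameter of the population and does not depend on research methods or sampling practices.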
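
The definition above can be checked by hand. The following is a minimal sketch: the helper name manual_pvariance and the data values are illustrative assumptions, not part of the statistics module.

```python
import statistics


def manual_pvariance(data):
    """Average squared distance from the mean (hypothetical helper)."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


# Illustrative data whose mean is exactly 5
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(manual_pvariance(data))      # 4.0
print(statistics.pvariance(data))  # same value as the manual result
```

The manual helper uses plain float arithmetic, so for less tidy data it may differ from pvariance() in the last few digits.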

Code #1:

Python3
# Python code to demonstrate the
# use of pvariance()

# importing statistics module
import statistics

# creating a population of values
population = (1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.9, 2.2,
              2.3, 2.4, 2.6, 2.9, 3.0, 3.4, 3.3, 3.2)

# Prints the population variance
print("Population variance is %s"
      % (statistics.pvariance(population)))



Output:

Population variance is 0.6658984375


Code #2: Demonstrating pvariance() on different kinds of population data.

Python3

# Python code to demonstrate pvariance()
# on different kinds of population data

# importing pvariance from the statistics module
from statistics import pvariance

# importing Fraction from the fractions module as F
from fractions import Fraction as F

# Population of positive integers
pop1 = (1, 2, 3, 5, 4, 6, 1, 2, 2, 3, 1, 3,
        7, 8, 9, 1, 1, 1, 2, 6, 7, 8, 9)

# Population of negative integers
pop2 = (-36, -35, -34, -32, -30, -31, -33, -33, -33,
        -38, -36, -35, -34, -38, -40, -31, -32)

# Population of fractions
pop3 = (F(1, 3), F(2, 4), F(2, 3),
        F(3, 2), F(2, 5), F(2, 2),
        F(1, 1), F(1, 4), F(1, 2), F(2, 1))

# Population of floating-point values
pop4 = (3.45, 3.2, 2.5, 4.6, 5.66, 6.43,
        4.32, 4.23, 6.65, 7.87, 9.87, 1.23,
        1.00, 1.45, 10.12, 12.22, 19.88)

# Print the population variance of each population
print("Population variance of set 1 is %s"
      % (pvariance(pop1)))

print("Population variance of set 2 is %s"
      % (pvariance(pop2)))

print("Population variance of set 3 is %s"
      % (pvariance(pop3)))

print("Population variance of set 4 is %s"
      % (pvariance(pop4)))

Output:

Population variance of set 1 is 7.913043478260869
Population variance of set 2 is 7.204152249134948
Population variance of set 3 is 103889/360000
Population variance of set 4 is 21.767923875432526


Code #3: Demonstrating the use of the mu parameter.

Python3

# Python code to demonstrate the use
# of the 'mu' parameter of pvariance()

# importing statistics module
import statistics

# pvariance() does not verify that the value
# passed for mu is the actual mean of the data.
# Passing an arbitrary value may therefore
# give invalid or impossible results.

# Population: ages of the kids in a locality
tree = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
        12, 12, 12, 12, 13, 1, 2, 12, 2, 2,
        2, 3, 4, 5, 5, 5, 5, 6, 6, 6)

# Finding the mean of the population
m = statistics.mean(tree)

# Passing the precomputed mean
# through the mu parameter
print("Population Variance is %s"
      % (statistics.pvariance(tree, mu=m)))

Output:

Population Variance is 14.30385015608741
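
As a further hedged illustration of mu: when the true mean has already been computed, passing it gives the same result as letting pvariance() compute the mean itself. The data and the names ages, with_mu and without_mu below are made up for this sketch.

```python
import statistics

# Illustrative data whose mean is exactly 5
ages = [2, 4, 4, 4, 5, 5, 7, 9]

# Compute the mean once so it can be reused
m = statistics.mean(ages)

# Passing the true mean as mu matches the result of omitting it
with_mu = statistics.pvariance(ages, mu=m)
without_mu = statistics.pvariance(ages)

print(with_mu == without_mu)  # True
```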


Code #4: Demonstrating the difference between pvariance() and variance().

Python3

# Python code to demonstrate the
# difference between pvariance()
# and variance()

# importing statistics module
import statistics

# Population data
tree = (1.1, 1.22, .23, .55, .67, 2.33, 2.81,
        1.54, 1.2, 0.2, 0.1, 1.22, 1.61)

# Sample drawn from the population
sample = (1.22, .23, .55, .67, 2.33,
          2.81, 1.54, 1.2, 0.2)

# Print the population variance
# and the sample variance
print("Variance of the whole population is %s"
      % (statistics.pvariance(tree)))

print("Variance of the sample from population is %s"
      % (statistics.variance(sample)))

# Print the absolute difference between the
# population variance and the sample variance
print()

print("Difference in Population variance "
      "and Sample variance is %s"
      % (abs(statistics.pvariance(tree)
             - statistics.variance(sample))))

Output:

Variance of the whole population is 0.6127751479289941
Variance of the sample from population is 0.8286277777777779

Difference in Population variance and Sample variance is 0.21585262984878373
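
When both functions are applied to the same data (rather than to a population and a subset of it, as in Code #4), the results differ only by the factor n / (n - 1). A small sketch under that assumption, with illustrative values:

```python
import math
import statistics

# Illustrative values; any data with at least two points works
data = [1.1, 1.22, 0.23, 0.55, 0.67, 2.33, 2.81]
n = len(data)

pop_var = statistics.pvariance(data)  # divides by n
samp_var = statistics.variance(data)  # divides by n - 1

# The sample variance is the population variance scaled by n / (n - 1)
print(math.isclose(samp_var, pop_var * n / (n - 1)))  # True
```

Because n / (n - 1) is greater than 1, variance() gives a larger value than pvariance() on the same data unless all values are identical.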

Note: In Code #4, the population variance and the sample variance differ by only about 0.22, so the two values are fairly close.

Code #5: Demonstrating StatisticsError

Python3

# Python code to demonstrate StatisticsError
 
# importing statistics module
import statistics
 
# creating an empty population set
pop = ()
 
# will raise StatisticsError
print(statistics.pvariance(pop))

Output:

Traceback (most recent call last):
  File "/home/fa112e1405f09970eeddd48214318a3c.py", line 10, in <module>
    print(statistics.pvariance(pop))
  File "/usr/lib/python3.5/statistics.py", line 603, in pvariance
    raise StatisticsError('pvariance requires at least one data point')
statistics.StatisticsError: pvariance requires at least one data point
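
In practice the error can be caught rather than left to terminate the program. The following is a minimal sketch; the helper name safe_pvariance and the choice of returning None for empty input are assumptions made for illustration.

```python
import statistics


def safe_pvariance(data):
    """Return the population variance, or None for empty data (hypothetical helper)."""
    try:
        return statistics.pvariance(data)
    except statistics.StatisticsError:
        # Raised when data contains no values
        return None


print(safe_pvariance(()))         # None
print(safe_pvariance((1, 2, 3)))  # variance of a non-empty population
```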


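Applications:
Population variance is applied in much the same way as sample variance, except that it covers every member of the population rather than a subset of it. Use pvariance() only when the variance of an entire population is required; to compute the variance of a sample, variance() is preferred. Population variance is an important tool in statistics and in handling large amounts of data. For example, when the true population mean is unknown and pvariance() is applied to a sample using the sample mean, the result is a biased estimator of the population variance.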
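
The bias mentioned above can be seen on a tiny example. The sketch below enumerates every ordered sample of size 2 drawn with replacement from a made-up four-element population and averages both estimators over all of them. The population and variable names are illustrative assumptions.

```python
from itertools import product
import statistics

# Illustrative population
population = [1, 2, 3, 4]
true_var = statistics.pvariance(population)

# Every ordered sample of size 2, drawn with replacement
samples = list(product(population, repeat=2))

# Average each estimator over all possible samples
avg_pvar = sum(statistics.pvariance(s) for s in samples) / len(samples)
avg_var = sum(statistics.variance(s) for s in samples) / len(samples)

print(true_var)  # 1.25
print(avg_pvar)  # 0.625, underestimates the true variance
print(avg_var)   # 1.25, matches the true variance
```

On average, pvariance() applied to the samples gives half the true variance here (a factor of (n - 1) / n with n = 2), while variance() recovers it.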