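Python statistics | pvariance()
Prerequisite: Python statistics | variance()
The pvariance() function computes the variance of an entire population rather than of a sample. The difference from variance() lies in the divisor: variance() treats the data as a sample and divides the sum of squared deviations from the sample mean by n - 1, while pvariance() treats the data as the whole population and divides the sum of squared deviations from the population mean by N.
Population variance, like sample variance, describes how spread out the data points of a particular population are. It is the average of the squared distances from each data point to the mean of the data set. Population variance is a population parameter and does not depend on research methods or sampling practices.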
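As a minimal sketch of the definition above, the population variance can be computed by hand and compared with statistics.pvariance(). The data values and the helper name manual_pvariance are assumptions made for illustration; the helper is not part of the statistics module.

```python
import statistics


def manual_pvariance(data):
    """Average of the squared distances from the mean (hypothetical helper)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n


# Illustrative population (assumed values); its mean is 5
data = [2, 4, 4, 4, 5, 5, 7, 9]

by_hand = manual_pvariance(data)
from_module = statistics.pvariance(data)

# Both approaches should agree on the same data
print(by_hand, from_module)
```

The hand-written version uses ordinary floating-point arithmetic, so on less convenient data it may differ from pvariance() in the last few digits.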
Syntax : pvariance( [data], mu)
Parameters :
[data] : An iterable with real valued numbers.
mu (optional): Takes actual mean of data-set/ population as value.
Return type : Returns the actual population variance of the values passed as parameter.
Exceptions :
StatisticsError is raised when the data-set passed as parameter is empty; at least one value is required.
No exception is raised when the value provided as mu doesn't match the actual mean of the data-set, but the result may then not be the population variance.
Code #1:
Python3
# Python code to demonstrate the
# use of pvariance()
# importing statistics module
import statistics
# creating a population of values
population = (1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.9, 2.2,
              2.3, 2.4, 2.6, 2.9, 3.0, 3.4, 3.3, 3.2)
# Prints the population variance
print("Population variance is %s"
      % (statistics.pvariance(population)))
Output :
Population variance is 0.6658984375
Code #2: Demonstrates pvariance() on a range of different population sets.
Python3
# Python code to demonstrate pvariance()
# on various range of population sets
# importing statistics module
from statistics import pvariance
# importing the Fraction class as F
from fractions import Fraction as F
# Population of positive integers
pop1 = (1, 2, 3, 5, 4, 6, 1, 2, 2, 3, 1, 3,
        7, 8, 9, 1, 1, 1, 2, 6, 7, 8, 9)
# Population of negative integers
pop2 = (-36, -35, -34, -32, -30, -31, -33, -33, -33,
        -38, -36, -35, -34, -38, -40, -31, -32)
# Population of fractional numbers
pop3 = (F(1, 3), F(2, 4), F(2, 3),
        F(3, 2), F(2, 5), F(2, 2),
        F(1, 1), F(1, 4), F(1, 2), F(2, 1))
# Population of decimal values
pop4 = (3.45, 3.2, 2.5, 4.6, 5.66, 6.43,
        4.32, 4.23, 6.65, 7.87, 9.87, 1.23,
        1.00, 1.45, 10.12, 12.22, 19.88)
# Print the population variance
# of each population
print("Population variance of set 1 is %s"
      % (pvariance(pop1)))
print("Population variance of set 2 is %s"
      % (pvariance(pop2)))
print("Population variance of set 3 is %s"
      % (pvariance(pop3)))
print("Population variance of set 4 is %s"
      % (pvariance(pop4)))
Output :
Population variance of set 1 is 7.913043478260869
Population variance of set 2 is 7.204152249134948
Population variance of set 3 is 103889/360000
Population variance of set 4 is 21.767923875432526
Code #3: Demonstrates the use of the mu parameter.
Python3
# Python code to demonstrate the use
# of 'mu' parameter on pvariance()
# importing statistics module
import statistics
# pvariance() does not check whether the value
# passed as mu is the actual mean of the data-set,
# so an incorrect value can give a misleading result
# Creating a population of the
# ages of kids in a locality
tree = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
        12, 12, 12, 12, 13, 1, 2, 12, 2, 2,
        2, 3, 4, 5, 5, 5, 5, 6, 6, 6)
# Finding the mean of the population
m = statistics.mean(tree)
# Passing the mean through the mu
# parameter of pvariance()
print("Population Variance is %s"
      % (statistics.pvariance(tree, mu=m)))
Output :
Population Variance is 14.30385015608741
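A small sketch of the same idea on a shorter data set (the values and the variable name ages are assumed for this example): passing the precomputed mean as mu should give the same result as letting pvariance() compute the mean itself, which is useful when the mean has already been calculated elsewhere.

```python
import statistics

# Illustrative population (assumed values); its mean is 5
ages = (2, 4, 4, 4, 5, 5, 7, 9)

# Mean computed once and reused
m = statistics.mean(ages)

without_mu = statistics.pvariance(ages)
with_mu = statistics.pvariance(ages, mu=m)

# The two calls are expected to agree when mu is the true mean
print(without_mu == with_mu)
```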
Code #4: Demonstrates the difference between pvariance() and variance().
Python3
# Python code to demonstrate the
# difference between pvariance()
# and variance()
# importing statistics module
import statistics
# Population, and a sample
# extracted from it
tree = (1.1, 1.22, .23, .55, .67, 2.33, 2.81,
        1.54, 1.2, 0.2, 0.1, 1.22, 1.61)
# Sample extracted from the population
sample = (1.22, .23, .55, .67, 2.33,
          2.81, 1.54, 1.2, 0.2)
# Print the population variance
# as well as the sample variance
print("Variance of the whole population is %s"
      % (statistics.pvariance(tree)))
print("Variance of the sample from population is %s"
      % (statistics.variance(sample)))
# Print the difference between the population
# variance and the sample variance
print("\n")
print("Difference in Population variance "
      "and Sample variance is %s"
      % (abs(statistics.pvariance(tree)
             - statistics.variance(sample))))
Output :
Variance of the whole population is 0.6127751479289941
Variance of the sample from population is 0.8286277777777779
Difference in Population variance and Sample variance is 0.21585262984878373
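Code #4 compares two different data sets, so part of the gap comes from the data itself. To isolate the effect of the divisor, the following sketch applies both functions to the same values (reusing the sample from Code #4); the two results should then differ only by the factor n / (n - 1).

```python
import math
import statistics

# Same values as the sample in Code #4
data = (1.22, .23, .55, .67, 2.33, 2.81, 1.54, 1.2, 0.2)
n = len(data)

pvar = statistics.pvariance(data)  # sum of squared deviations / n
svar = statistics.variance(data)   # sum of squared deviations / (n - 1)

# Rescaling the population variance by n / (n - 1)
# should reproduce the sample variance
print(math.isclose(svar, pvar * n / (n - 1)))
```

Because n / (n - 1) is greater than 1, variance() is always at least as large as pvariance() on the same data, and the gap shrinks as n grows.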
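Note: In Code #4, the population variance (about 0.61) and the sample variance (about 0.83) are of similar magnitude but not equal; they differ by about 0.22.
Code #5: Demonstrates StatisticsError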
Python3
# Python code to demonstrate StatisticsError
# importing statistics module
import statistics
# creating an empty population set
pop = ()
# will raise StatisticsError
print(statistics.pvariance(pop))
Output :
Traceback (most recent call last):
  File "/home/fa112e1405f09970eeddd48214318a3c.py", line 10, in <module>
    print(statistics.pvariance(pop))
  File "/usr/lib/python3.5/statistics.py", line 603, in pvariance
    raise StatisticsError('pvariance requires at least one data point')
statistics.StatisticsError: pvariance requires at least one data point
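In practice the error can be caught rather than left to terminate the program. A minimal sketch, in which the helper name safe_pvariance and the choice of None as the fallback value are assumptions made for illustration:

```python
import statistics


def safe_pvariance(data):
    """Return the population variance, or None for empty data (hypothetical helper)."""
    try:
        return statistics.pvariance(data)
    except statistics.StatisticsError:
        # Raised by pvariance() when no data points are given
        return None


# Empty input is handled instead of raising
print(safe_pvariance(()))

# A single data point is enough for pvariance()
print(safe_pvariance((3,)))
```

Whether None, an exception, or some other signal is the right response to empty data depends on the calling code.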
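Applications:
The applications of population variance are very similar to those of sample variance, although population variance describes an entire population rather than a subset of it. Population variance should be used only when the variance of an entire population is to be calculated; for the variance of a sample, variance() is preferred. Population variance is a very important tool in statistics and in handling large amounts of data. Note that when the true population mean is unknown and only a sample is available, applying pvariance() to that sample gives a biased estimate of the population variance.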
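The bias mentioned above can be illustrated without random numbers. The sketch below, using a small assumed population, enumerates every ordered sample of size 2 drawn with replacement and averages both estimators over all of them. Under this sampling scheme the average of variance() should match the true population variance, while the average of pvariance() should fall below it.

```python
import itertools
import statistics

# Small illustrative population (assumed values)
population = (1, 2, 3, 4)
true_var = statistics.pvariance(population)

# Every ordered sample of size 2, drawn with replacement
samples = list(itertools.product(population, repeat=2))

# Average of each estimator over all possible samples
mean_pvar = statistics.mean([statistics.pvariance(s) for s in samples])
mean_svar = statistics.mean([statistics.variance(s) for s in samples])

print("True population variance:", true_var)
print("Average pvariance() over samples:", mean_pvar)
print("Average variance() over samples:", mean_svar)
```

This exhaustive enumeration is only practical for tiny populations; it is meant to show the direction of the bias, not to serve as a general estimation procedure.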