median_grouped() function in the Python statistics module
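Among statistical functions, the median of a data set is a robust measure of central tendency that is less affected by outliers in the data. As seen previously, the median of ungrouped data is computed with the median(), median_high() and median_low() functions.
Python also provides a way to compute the median of grouped, continuous data: the median_grouped() function in the statistics module calculates the median of continuous data that has been grouped into intervals.
The data is assumed to be grouped into intervals of width interval. Each data point in the array is the midpoint of the interval containing the true value. The median is calculated by interpolating within the median interval (the interval that contains the median), assuming that the true values within that interval are distributed uniformly: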
median = L + interval * (N / 2 - CF) / F
L = lower limit of the median interval
N = total number of data points
CF = number of data points below the median interval
F = number of data points in the median interval
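The formula can be checked by applying it step by step and comparing the result with the library function. The sketch below is only an illustration: the helper name grouped_median_by_hand is hypothetical, and it assumes the median interval is the one whose midpoint sits at index N // 2 of the sorted data.

```python
from statistics import median_grouped


def grouped_median_by_hand(data, interval=1):
    """Hypothetical helper: apply the interpolation formula term by term."""
    values = sorted(data)
    n = len(values)                          # N
    if n == 0:
        raise ValueError("no median for empty data")
    x = values[n // 2]                       # midpoint of the median interval
    lower = x - interval / 2                 # L
    cf = sum(1 for v in values if v < x)     # CF
    f = values.count(x)                      # F
    return lower + interval * (n / 2 - cf) / f


sample = [10, 12, 13, 12, 13, 15]
by_hand = grouped_median_by_hand(sample, interval=2)
from_library = median_grouped(sample, interval=2)
print(by_hand, from_library)  # 12.0 12.0
```

Here N = 6, the median interval is centred on 13, so L = 12, CF = 3 and F = 2, which gives 12 + 2 * (3 - 3) / 2 = 12.0.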
Syntax : median_grouped( [data-set], interval)
Parameters :
[data-set] : List or tuple or an iterable with a set of numeric values.
interval (1 by default) : The width of the intervals into which the data is grouped. Changing it changes the interpolation, and therefore the calculated median.
Return type : Returns the median of grouped continuous data, calculated as the 50th percentile.
Exceptions : StatisticsError is raised when the iterable passed is empty. Passing None instead of an iterable raises TypeError, not StatisticsError.
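Since each value passed in stands for the midpoint of its interval, data that arrives as a frequency table has to be expanded into one midpoint per observation first. The following is a minimal sketch of that step; the table itself (bins 20-30, 30-40 and 40-50) is made up for illustration.

```python
from statistics import median_grouped

# Hypothetical frequency table: (bin midpoint, count) for bins of width 10,
# i.e. the bins 20-30, 30-40 and 40-50.
frequency_table = [(25, 3), (35, 5), (45, 2)]

# Expand the table into one midpoint per observation.
data = [midpoint for midpoint, count in frequency_table for _ in range(count)]

# interval must match the bin width used to build the table.
result = median_grouped(data, interval=10)
print(result)  # 34.0
```

The median interval is 30-40, so L = 30, CF = 3, F = 5 and N = 10, giving 30 + 10 * (5 - 3) / 5 = 34.0.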
Code #1 :
Python3
# Python3 code to demonstrate median_grouped()
# importing median_grouped from
# the statistics module
from statistics import median_grouped
# creating a simple data-set
data1 = [15, 20, 25, 30, 35]
# printing median_grouped for the set
print("Grouped Median of the median is %s" % median_grouped(data1))
Output :
Grouped Median of the median is 25.0
Code #2 : Working of median_grouped() on a varying range of data
Python3
# Python code to demonstrate the
# working of median_grouped()
# importing statistics module
from statistics import median_grouped
# list of positive integers
set1 = [2, 5, 3, 4, 8, 9]
# list of negative integers
set2 = [-6, -2, -9, -12]
# list of positive
# and negative integers
set3 = [2, 4, 8, 9, -2, -3, -5, -6]
# Printing grouped median for
# the given set of data
print("Grouped Median of set 1 is %s" % median_grouped(set1))
print("Grouped Median of set 2 is %s" % median_grouped(set2))
print("Grouped Median of set 3 is %s" % median_grouped(set3))
Output :
Grouped Median of set 1 is 4.5
Grouped Median of set 2 is -6.5
Grouped Median of set 3 is 1.5
Code #3 : Working of interval
Python3
# Python code to demonstrate the working of
# interval in median_grouped() function
# importing statistics module
from statistics import median_grouped
# creating a tuple of simple data
set1 = (10, 12, 13, 12, 13, 15)
# Printing median_grouped()
# keeping default interval at 1
print("Grouped Median for Interval set as "
      "(default) 1 is %s" % median_grouped(set1))
# For interval value of 2
print("Grouped Median for Interval set as "
      "2 is %s" % median_grouped(set1, interval=2))
# Now for interval value of 5
print("Grouped Median for Interval set as "
      "5 is %s" % median_grouped(set1, interval=5))
# And for interval value of 10
print("Grouped Median for Interval set as "
      "10 is %s" % median_grouped(set1, interval=10))
Output :
Grouped Median for Interval set as (default) 1 is 12.5
Grouped Median for Interval set as 2 is 12.0
Grouped Median for Interval set as 5 is 10.5
Grouped Median for Interval set as 10 is 8.0
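The results above shrink as interval grows, but that depends on the data. Writing L as x - interval / 2, where x is the midpoint of the median interval, the formula becomes median = x + interval * ((N / 2 - CF) / F - 1 / 2), so the direction depends on whether (N / 2 - CF) / F is below or above 1 / 2. The sketch below compares the data from Code #3 with a second data set that is made up purely as a counter-example.

```python
from statistics import median_grouped

decreasing = (10, 12, 13, 12, 13, 15)  # data from Code #3: (N/2 - CF)/F = 0
increasing = (1, 2, 2, 2, 3, 4, 5)     # hypothetical data: (N/2 - CF)/F = 2.5/3

intervals = (1, 2, 5)
decreasing_results = [median_grouped(decreasing, i) for i in intervals]
increasing_results = [median_grouped(increasing, i) for i in intervals]

for i, down, up in zip(intervals, decreasing_results, increasing_results):
    print("interval", i, "->", down, "and", round(up, 4))
```

For the first data set the median falls as interval grows; for the second it rises, because the interpolation term outweighs the drop in the lower limit.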
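Note : For this data set, the median decreases as the interval value increases. With other data the result can move the other way, depending on where the median falls within its interval.
Code #4 : Demonstrating StatisticsError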
Python3
# Python code to demonstrate StatisticsError
# importing the statistics module
import statistics
# creating an empty dataset
list1 = []
# Will raise StatisticsError
print(statistics.median_grouped(list1))
Output :
Traceback (most recent call last):
  File "/home/0990a4a3f5206c7cd12a596cf82a1587.py", line 10, in <module>
    print(statistics.median_grouped(list1))
  File "/usr/lib/python3.5/statistics.py", line 431, in median_grouped
    raise StatisticsError("no median for empty data")
statistics.StatisticsError: no median for empty data
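In a real program the error would usually be caught rather than left to stop the script. One possible way to do that is sketched below; the wrapper name safe_median_grouped and its default parameter are hypothetical and not part of the statistics module.

```python
from statistics import StatisticsError, median_grouped


def safe_median_grouped(data, interval=1, default=None):
    """Hypothetical wrapper: return `default` instead of raising on empty data."""
    try:
        return median_grouped(data, interval)
    except StatisticsError:
        return default


empty_result = safe_median_grouped([])
normal_result = safe_median_grouped([15, 20, 25])
print(empty_result)   # None
print(normal_result)  # 20.0
```

Returning a default is only one design choice; logging the problem and re-raising may be more appropriate when an empty data set signals a bug upstream.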
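Applications :
The grouped median has all the same applications as the median. It is commonly used in calculations involving large amounts of data, such as in banking and finance. It is a standard part of descriptive statistics and a useful tool whenever data is available only in grouped form.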