SciPy stats.binned_statistic() function | Python
stats.binned_statistic(x, values, statistic='mean', bins=10, range=None)
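function computes a binned statistic for the given data (array elements).
It works much like a histogram function. Where a histogram function creates bins and counts the number of points in each bin, this function computes the sum, mean, median, count, or another statistic of the values that fall in each bin.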
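The relationship to a histogram can be illustrated with a small sketch. The arrays below are made-up sample data, not taken from this article; the point is only that statistic='count' reproduces the histogram counts, while statistic='sum' aggregates the values instead of counting the points.

```python
import numpy as np
from scipy import stats

# Hypothetical sample data, chosen only for illustration
x = np.array([1.0, 2.0, 2.5, 6.0, 9.0])
values = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# A histogram counts how many points of x land in each bin
hist_counts, hist_edges = np.histogram(x, bins=2)

# statistic='count' gives the same per-bin counts
count_result = stats.binned_statistic(x, values, statistic='count', bins=2)

# statistic='sum' adds up the entries of values whose x falls in each bin
sum_result = stats.binned_statistic(x, values, statistic='sum', bins=2)

print("histogram counts :", hist_counts)
print("binned count     :", count_result.statistic)
print("binned sum       :", sum_result.statistic)
```

Both calls split the span from x.min() to x.max() into two equal-width bins, so the bin edges should agree between the two functions.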
Parameters :
x : [array_like] Input array to be binned.
values : [array_like] Data on which the statistic is computed. It must have the same length as x.
statistic : [string or callable] Statistic to compute: 'mean', 'count', 'median', 'sum', or a user-defined function. Default is 'mean'.
bins : [int or sequence of scalars] If bins is an int, it defines the number of equal-width bins in the given range (10, by default). If bins is a sequence, it defines the bin edges.
range : (float, float) Lower and upper range of the bins. If not provided, the range is (x.min(), x.max()). Values outside the range are ignored.
Returns : statistic (the value of the statistic in each bin); bin_edges (the bin edges); binnumber (the index of the bin that each element of x falls into).
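The less obvious parameter forms can be sketched as follows: bins given as an explicit sequence of edges, a restricted range, and a callable statistic. The helper value_range is a hypothetical function written for this illustration and is not part of SciPy.

```python
import numpy as np
from scipy import stats

x = np.array([20, 2, 7, 1, 34])
values = np.arange(5)

# bins as a sequence: two bins with edges 0, 10 and 40
edges_result = stats.binned_statistic(
    x, values, statistic='sum', bins=[0, 10, 40])

# range limits the binning to [0, 10]; 20 and 34 fall outside and are not counted
range_result = stats.binned_statistic(
    x, values, statistic='count', bins=2, range=(0, 10))


def value_range(v):
    """Hypothetical user-defined statistic: max minus min of the bin's values."""
    v = np.asarray(v)
    return v.max() - v.min()


# statistic as a callable, applied to the values of each bin
custom_result = stats.binned_statistic(
    x, values, statistic=value_range, bins=[0, 10, 40])

print("sum with explicit edges :", edges_result.statistic)
print("count within range      :", range_result.statistic)
print("custom statistic        :", custom_result.statistic)
```

A callable receives the 1D array of values belonging to one bin and should return a single number.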
Code #1:
# stats.binned_statistic() method
import numpy as np
from scipy import stats
# 1D array
arr = [20, 2, 7, 1, 34]
print("\narr : \n", arr)
# median
print("\nbinned_statistic for median : \n", stats.binned_statistic(
    arr, np.arange(5), statistic='median', bins=4))
Output :
arr :
[20, 2, 7, 1, 34]
binned_statistic for median :
BinnedStatisticResult(statistic=array([ 2., nan, 0., 4.]),
bin_edges=array([ 1., 9.25, 17.5, 25.75, 34. ]),
binnumber=array([3, 1, 1, 1, 4], dtype=int64))
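The nan in the second position of the median output marks a bin that received no points. One possible way to confirm this, sketched here with the same data, is to compare against statistic='count' and mask the empty bins with np.isnan.

```python
import numpy as np
from scipy import stats

arr = [20, 2, 7, 1, 34]
values = np.arange(5)

median_result = stats.binned_statistic(arr, values, statistic='median', bins=4)
count_result = stats.binned_statistic(arr, values, statistic='count', bins=4)

# Bins with no points have a count of 0 and a median of nan
empty_bins = np.isnan(median_result.statistic)

print("counts per bin :", count_result.statistic)
print("empty bins     :", empty_bins)
```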
Code #2:
# stats.binned_statistic() method
import numpy as np
from scipy import stats
# mean
arr = [20, 2, 7, 1, 34]
print("\nbinned_statistic for mean : \n", stats.binned_statistic(
    arr, np.arange(5), statistic='mean', bins=2))
Output :
binned_statistic for mean :
BinnedStatisticResult(statistic=array([2., 2.]),
bin_edges=array([ 1., 17.5, 34. ]),
binnumber=array([2, 1, 1, 1, 2], dtype=int64))
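The three fields of the result can also be unpacked and cross-checked by hand. The following sketch reuses the data from Code #2 and recomputes each bin's mean from binnumber, whose bin indices start at 1; the variable names are chosen here for illustration.

```python
import numpy as np
from scipy import stats

arr = np.array([20, 2, 7, 1, 34])
values = np.arange(5)

# The result behaves like a tuple of (statistic, bin_edges, binnumber)
statistic, bin_edges, binnumber = stats.binned_statistic(
    arr, values, statistic='mean', bins=2)

# Group the values by bin index and average each group manually
manual_means = [values[binnumber == i].mean() for i in (1, 2)]

print("statistic    :", statistic)
print("manual means :", manual_means)
```

Bin 1 holds the values for 2, 7 and 1 (indices 1, 2, 3) and bin 2 holds the values for 20 and 34 (indices 0 and 4), so both means come out as 2.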