Matplotlib.pyplot.violinplot() in Python
Matplotlib is a Python visualization library for 2D plots of arrays. It is a multi-platform data visualization library built on NumPy arrays and designed to work with the broader SciPy stack.
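Matplotlib.pyplot.violinplot()
As the name suggests, matplotlib.pyplot.violinplot() is used to make violin plots. With this function you can draw one violin for each column of a dataset, or one for each vector in a sequence of datasets. Each filled area extends to cover the entire data range, with optional lines at the mean, the median, the minimum, and the maximum.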
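The two input forms described above can be sketched as follows. This is a minimal illustration, not part of the original examples: the data is randomly generated, and the names table, vectors, from_columns and from_vectors are made up for the sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, used only for reproducibility

# Form 1: a 2D array, which gives one violin per column (3 columns here)
table = rng.normal(size=(100, 3))

# Form 2: a sequence of 1D vectors, which may differ in length (2 vectors here)
vectors = [rng.normal(size=50), rng.normal(size=80)]

fig, (ax_left, ax_right) = plt.subplots(1, 2)
from_columns = ax_left.violinplot(table)
from_vectors = ax_right.violinplot(vectors)

# Number of violins drawn from each input form
print(len(from_columns["bodies"]), len(from_vectors["bodies"]))  # 3 2
```

To view the figure interactively, drop the matplotlib.use("Agg") line and call plt.show() at the end.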
Syntax: matplotlib.pyplot.violinplot(dataset, positions=None, vert=True, widths=0.5, showmeans=False, showextrema=True, showmedians=False, points=100, bw_method=None, *, data=None)
Parameters:
- dataset: A required parameter, generally an array or a sequence of vectors. This is the input data for the plot.
- positions: An array-like object whose default value is an array from 1 to n (i.e., default = [1, 2, 3…n]). It sets the positions of the violins. The axis limits and ticks are set automatically to match the positions.
- vert: Accepts a boolean value; the default is True. If True, a vertical violin plot is created; otherwise the plot is horizontal.
- widths: Accepts a scalar or an array-like object; the default is 0.5. It sets the maximal width of each violin. With the default value, each violin takes up about half of the available horizontal space.
- showmeans: Accepts a boolean value; the default is False. If True, the means are rendered.
- showextrema: Accepts a boolean value; the default is True. If True, the extrema are rendered.
- showmedians: Accepts a boolean value; the default is False. If True, the medians are rendered.
- points: Accepts a scalar; the default is 100. It defines the number of points at which each Gaussian kernel density estimate is evaluated.
- bw_method: An optional parameter that accepts a string, a scalar, or a callable. It is the method used to calculate the estimator bandwidth. It can be 'scott', 'silverman', a scalar constant, or a callable. If a scalar, it is used directly as kde.factor. If a callable, it should take a GaussianKDE instance as its only parameter and return a scalar. If None (the default), 'scott' is used.
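The parameters above can be combined as in the following minimal sketch. The data, the seed, and the specific values chosen for positions, widths, and points are illustrative assumptions rather than recommendations. The vert parameter is left at its default here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, used only for reproducibility
# Three made-up samples with increasing spread
samples = [rng.normal(0, spread, 200) for spread in (1, 2, 3)]

fig, ax = plt.subplots()
parts = ax.violinplot(
    samples,
    positions=[1, 3, 5],    # place the violins at x = 1, 3, 5 instead of 1, 2, 3
    widths=0.8,             # maximal width of each violin
    showmeans=True,         # draw a line at each mean
    showmedians=True,       # draw a line at each median
    showextrema=False,      # hide the minimum/maximum lines
    points=200,             # evaluate each density estimate at 200 points
    bw_method="silverman",  # bandwidth rule for the kernel density estimate
)
ax.set_xticks([1, 3, 5])

# One filled body per input sample
print(len(parts["bodies"]))  # 3
```

To view the figure interactively, drop the matplotlib.use("Agg") line and call plt.show() at the end.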
Returns: This function returns a dictionary that maps each component of the violin plot to the corresponding collection instances. The returned dictionary can have the following keys:
- bodies: A list of matplotlib.collections.PolyCollection instances containing the filled area of each violin.
- cmeans: A matplotlib.collections.LineCollection instance that marks the mean of each violin distribution.
- cmins: A matplotlib.collections.LineCollection instance that marks the bottom of each violin distribution.
- cmaxes: A matplotlib.collections.LineCollection instance that marks the top of each violin distribution.
- cbars: A matplotlib.collections.LineCollection instance that marks the center of each violin distribution.
- cmedians: A matplotlib.collections.LineCollection instance that marks the median value of each violin distribution.
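The returned dictionary is mainly useful for restyling a plot after it is drawn. The sketch below is one possible way to do that, using made-up data and arbitrary colors. It requests the means and medians explicitly, on the assumption that the cmeans and cmedians entries are only created when the corresponding show flags are enabled.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection, PolyCollection

rng = np.random.default_rng(1)  # arbitrary seed, used only for reproducibility
samples = [rng.normal(loc, 1.0, 150) for loc in (0, 1, 2)]

fig, ax = plt.subplots()
parts = ax.violinplot(samples, showmeans=True, showmedians=True)

# 'bodies' is a list with one PolyCollection per violin
for body, color in zip(parts["bodies"], ["tab:blue", "tab:orange", "tab:green"]):
    body.set_facecolor(color)
    body.set_edgecolor("black")
    body.set_alpha(0.6)

# Each line component is a single LineCollection covering all violins
parts["cmeans"].set_color("red")
parts["cmedians"].set_color("black")
for key in ("cmins", "cmaxes", "cbars"):
    parts[key].set_linewidth(1.0)

# Confirm the types described in the list above
bodies_ok = all(isinstance(body, PolyCollection) for body in parts["bodies"])
lines_ok = all(isinstance(parts[key], LineCollection)
               for key in ("cmeans", "cmedians", "cmins", "cmaxes", "cbars"))
print(bodies_ok, lines_ok)  # True True
```

To view the figure interactively, drop the matplotlib.use("Agg") line and call plt.show() at the end.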
Example 1:
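import numpy as np
import matplotlib.pyplot as plt
# Fix the random state for reproducibility
np.random.seed(21)
data = np.random.random(111)
# First quartile, median, and third quartile of the sample
quartile1, median, quartile3 = np.percentile(data, [25, 50, 75], axis=0)
plt.violinplot(data)
# Dashed red line spanning the interquartile range at x = 1
plt.vlines(1, quartile1, quartile3, color='r', linestyle='--')
# Short horizontal lines at the first and third quartiles
plt.hlines(quartile1, .7, 1.2)
plt.hlines(quartile3, .7, 1.2)
plt.show()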
Output:
Example 2:
import numpy as np
import matplotlib.pyplot as plt
# Fix the random state for
# reproducibility
np.random.seed(15437660)
# Create randomly generated
# collections of data
coll_1 = np.random.normal(100, 10, 200)
coll_2 = np.random.normal(80, 30, 200)
coll_3 = np.random.normal(90, 20, 200)
coll_4 = np.random.normal(70, 25, 200)
# Combine the collections
# into a list
data_plotter = [coll_1, coll_2,
                coll_3, coll_4]
plt.violinplot(data_plotter)
plt.show()
Output: