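Box Plot in Plotly Using the graph_objects Class
Plotly is a Python library for creating charts, especially interactive ones. It can draw many kinds of plots, such as histograms, bar charts, box plots, and scatter plots. It is mainly used in data analysis and financial analysis.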
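Box Plot Using the graph_objects Class
If Plotly Express does not provide a good starting point, the more general go.Box class from plotly.graph_objects can be used instead. A box plot is a standardized way of displaying the distribution of data, based on the following five main components:
- Minimum: the lowest data point, excluding any outliers.
- Maximum: the highest data point, excluding any outliers.
- Median (Q2 / 50th percentile): the middle value of the dataset.
- First quartile (Q1 / 25th percentile): also called the lower quartile, qn(0.25); the median of the lower half of the dataset.
- Third quartile (Q3 / 75th percentile): also called the upper quartile, qn(0.75); the median of the upper half of the dataset.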
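To make the five components concrete, here is a minimal sketch that computes them for a small sample using only the Python standard library. The helper name five_number_summary is hypothetical. The quartiles use the "inclusive" convention of statistics.quantiles, which need not match Plotly's default, and outliers are identified with the common 1.5 × IQR rule, which is an assumption not stated above.

```python
import statistics

def five_number_summary(values, whisker=1.5):
    """Return the box plot components of a sample (hypothetical helper)."""
    data = sorted(values)
    # Quartiles under one common convention; other conventions differ slightly
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed outlier rule: anything beyond whisker * IQR from the box
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inliers = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "minimum": inliers[0],    # lowest point that is not an outlier
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": inliers[-1],   # highest point that is not an outlier
        "outliers": outliers,
    }

summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
print(summary)
```

In this sample the value 50 lies beyond the upper fence, so the reported maximum is 9 rather than 50.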
Syntax: class plotly.graph_objects.Box(arg=None, alignmentgroup=None, boxmean=None, boxpoints=None, customdata=None, customdatasrc=None, dx=None, dy=None, fillcolor=None, hoverinfo=None, hoverinfosrc=None, hoverlabel=None, hoveron=None, hovertemplate=None, hovertemplatesrc=None, hovertext=None, hovertextsrc=None, ids=None, idssrc=None, jitter=None, legendgroup=None, line=None, lowerfence=None, lowerfencesrc=None, marker=None, mean=None, meansrc=None, median=None, mediansrc=None, meta=None, metasrc=None, name=None, notched=None, notchspan=None, notchspansrc=None, notchwidth=None, offsetgroup=None, opacity=None, orientation=None, pointpos=None, q1=None, q1src=None, q3=None, q3src=None, quartilemethod=None, sd=None, sdsrc=None, selected=None, selectedpoints=None, showlegend=None, stream=None, text=None, textsrc=None, uid=None, uirevision=None, unselected=None, upperfence=None, upperfencesrc=None, visible=None, whiskerwidth=None, width=None, x=None, x0=None, xaxis=None, xcalendar=None, xsrc=None, y=None, y0=None, yaxis=None, ycalendar=None, ysrc=None, **kwargs)
Parameters:
x: Sets the x sample data or coordinates. See overview for more info.
y: Sets the y sample data or coordinates. See overview for more info.
hoverinfo: Determines which trace information appears on hover.
marker: A plotly.graph_objects.box.Marker instance or a dict with compatible properties.
mean: Sets the mean values.
median: Sets the median values.
Example:
Python3
import plotly.graph_objects as go
import numpy as np
# Generate reproducible random data with numpy.random.randint
np.random.seed(42)
random_y1 = np.random.randint(1, 101, 100)
random_y2 = np.random.randint(1, 101, 100)
# Passing y draws one vertical box per trace
plot = go.Figure()
plot.add_trace(go.Box(y=random_y1))
plot.add_trace(go.Box(y=random_y2))
plot.show()
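Output:
Horizontal Box Plot
A horizontal box plot draws each box along the x-axis rather than the y-axis. It is created by passing the sample data to the x parameter of the box trace instead of y.
Example: using the x parameter for a horizontal plot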
Python3
import plotly.graph_objects as go
import numpy as np
# Generate reproducible random data with numpy.random.randint
np.random.seed(42)
random_x1 = np.random.randint(1, 101, 100)
random_x2 = np.random.randint(1, 101, 100)
# Passing x instead of y draws the boxes horizontally
plot = go.Figure()
plot.add_trace(go.Box(x=random_x1))
plot.add_trace(go.Box(x=random_x2))
plot.show()
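Output:
Finding the Mean and Standard Deviation
The boxmean parameter displays the mean, and optionally the standard deviation, of the data drawn by the box plot. It can take two values:
- True to show the mean
- 'sd' to show the mean and the standard deviation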
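As a quick cross-check on what the mean and standard deviation markers represent, the same statistics can be computed directly with the standard library. This sketch prints both the population and the sample standard deviation; which of the two Plotly draws is not stated here, so treat either choice as an assumption when comparing against the figure.

```python
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(sample)
population_sd = statistics.pstdev(sample)  # divides by n
sample_sd = statistics.stdev(sample)       # divides by n - 1

print(mean, population_sd, sample_sd)
```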
Example:
Python3
import plotly.graph_objects as go
import numpy as np
# Generate reproducible random data with numpy.random.randint
np.random.seed(42)
random_x1 = np.random.randint(1, 101, 100)
random_x2 = np.random.randint(1, 101, 100)
plot = go.Figure()
# boxmean=True marks the mean; boxmean='sd' also shows the standard deviation
plot.add_trace(go.Box(x=random_x1, marker_color='indianred', boxmean=True))
plot.add_trace(go.Box(x=random_x2, marker_color='royalblue', boxmean='sd'))
plot.show()
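Output:
Changing the Quartile Algorithm
Plotly also lets you choose the algorithm used to compute the quartiles. By default the linear algorithm is used, but two alternatives, inclusive and exclusive, are available. The algorithm is selected with the quartilemethod parameter.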
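The difference between the inclusive and exclusive methods is easiest to see on a small odd-sized sample. The sketch below follows the usual description of the two methods: both split the sorted data at the median and take the median of each half, and they differ only in whether the median itself is kept in the halves when the sample size is odd. The function quartiles is a hypothetical helper written for illustration, not Plotly's implementation, and it does not cover the linear method.

```python
import statistics

def quartiles(values, method="exclusive"):
    """Return (Q1, Q3) as medians of the lower and upper halves (illustrative helper)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    if n % 2 == 1 and method == "inclusive":
        # Odd sample, inclusive: the middle element belongs to both halves
        lower, upper = data[:half + 1], data[half:]
    else:
        # Even sample, or exclusive: split without the middle element
        lower, upper = data[:half], data[n - half:]
    return statistics.median(lower), statistics.median(upper)

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(quartiles(sample, "exclusive"))  # (2.5, 7.5)
print(quartiles(sample, "inclusive"))  # (3, 7)
```

For samples with an even number of values the two methods give the same result in this sketch, since there is no middle element to include or exclude.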
Example:
Python3
import plotly.graph_objects as go
import numpy as np
# Generate reproducible random data with numpy.random.randint
np.random.seed(42)
random_y = np.random.randint(1, 101, 100)
# Plot the same data three times, once per quartile algorithm
plot = go.Figure()
plot.add_trace(go.Box(y=random_y, quartilemethod="linear", name="linear"))
plot.add_trace(go.Box(y=random_y, quartilemethod="inclusive", name="inclusive"))
plot.add_trace(go.Box(y=random_y, quartilemethod="exclusive", name="exclusive"))
plot.show()
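Output:
Showing the Underlying Data
The underlying data points can be displayed with the boxpoints parameter. The values covered here are:
- "all" to show all points
- "outliers" to show only the outliers
- False to show none of the points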
Example:
Python3
import plotly.graph_objects as go
import numpy as np
# Generate reproducible random data with numpy.random.randint
np.random.seed(42)
random_y1 = np.random.randint(1, 101, 100)
random_y2 = np.random.randint(1, 101, 100)
plot = go.Figure()
# First box shows every point; second box shows only the outliers
plot.add_trace(go.Box(y=random_y1, boxpoints="all"))
plot.add_trace(go.Box(y=random_y2, boxpoints="outliers"))
plot.show()
Output: