📅  Last modified: 2023-12-03 15:05:07.988000             🧑  Author: Mango
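Seaborn is a Python data visualization library based on Matplotlib. One of its long-standing functions is distplot, which plots the univariate distribution of the observations in a dataset. Note that distplot has been deprecated since seaborn 0.11 in favor of histplot and displot, so recent versions emit a warning when it is called. The examples here assume a seaborn version that still provides it.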
Before using the distplot function, we need to import the necessary libraries. The most commonly used libraries are Seaborn, Matplotlib, NumPy, and Pandas.
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
Let's create a random dataset to plot using the distplot function: 1,000 draws from a standard normal distribution, with a fixed seed so the result is reproducible.
np.random.seed(123)
data = np.random.normal(size=1000)
The basic syntax of the distplot function is as follows:
sns.distplot(a, bins=None, hist=True, kde=True, rug=False, fit=None, hist_kws=None, kde_kws=None, rug_kws=None, fit_kws=None, color=None, vertical=False, norm_hist=False, axlabel=None, label=None, ax=None)
The parameters of the distplot function are described below.
a: The input data. It can be a Pandas Series, a one-dimensional NumPy array, or a list. If a Series has a name, that name is used as the axis label.
bins: The histogram bin specification, passed on to Matplotlib's hist. It can be a number of bins or a sequence of bin edges. If None, the number of bins is chosen automatically.
hist: Whether to draw the histogram (default True).
kde: Whether to draw a Gaussian kernel density estimate (default True).
rug: Whether to draw a small tick on the data axis at each observation (default False).
fit: An object with fit and pdf methods, such as a distribution from scipy.stats, that is fitted to the data and drawn as a curve.
hist_kws: A dictionary of keyword arguments passed to the underlying histogram call, for example to set its transparency or edge color.
kde_kws: A dictionary of keyword arguments passed to the underlying density plot.
rug_kws: A dictionary of keyword arguments passed to the underlying rug plot.
fit_kws: A dictionary of keyword arguments used when drawing the fitted distribution.
color: The color used for every component of the plot except the fitted curve.
vertical: If True, the observed values are placed on the y-axis instead of the x-axis.
norm_hist: If True, the histogram height shows a density rather than a count. This is implied whenever a KDE or a fitted distribution is drawn.
axlabel: The label for the data axis. If None, it is taken from the name of a when available; if False, no label is set.
label: The legend label for the relevant component of the plot.
ax: The Matplotlib Axes object to draw on. If None, the current Axes is used.
Now, let's plot the randomly generated dataset using the distplot function.
sns.distplot(data)
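The resulting plot shows a histogram with a kernel density estimate drawn over it. Because the KDE is drawn, the histogram is normalized to a density rather than showing raw counts. When bins is not given, the number of bins is chosen automatically using the Freedman-Diaconis rule, capped at 50.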
We can customize the distplot function using various parameters. Let's explore some of the most commonly used ones.
bins
The bins parameter sets the number of bins in the histogram. Let's set the number of bins to 30.
sns.distplot(data, bins=30)
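Because bins is handed to Matplotlib's hist, it can also be a sequence of bin edges instead of a count. The sketch below, assuming a seaborn version that still provides distplot, uses 17 evenly spaced edges (an arbitrary choice for illustration) to get 16 bins of width 0.5.

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

np.random.seed(123)
data = np.random.normal(size=1000)

# 17 edges between -4 and 4 define 16 bins, each 0.5 wide.
edges = np.linspace(-4, 4, 17)

fig, ax = plt.subplots()
sns.distplot(data, bins=edges, kde=False, ax=ax)

widths = [patch.get_width() for patch in ax.patches]
print(len(widths))
```

Explicit edges are useful when several plots must share identical bins so that their bars are directly comparable.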
hist
The hist parameter controls whether the histogram is drawn. If we set this parameter to False, only the kernel density estimate will be plotted.
sns.distplot(data, hist=False)
kde
The kde parameter controls whether the kernel density estimate is drawn. If we set this parameter to False, only the histogram will be plotted, and its height shows counts rather than a density unless norm_hist=True is also passed.
sns.distplot(data, kde=False)
rug
The rug parameter draws a small vertical tick at each observation.
sns.distplot(data, rug=True)
color
The color parameter sets the color used for the histogram, the KDE curve, and the rug.
sns.distplot(data, color='red')
vertical
The vertical parameter places the observed values on the y-axis, so the histogram bars extend horizontally.
sns.distplot(data, vertical=True)
The distplot function is a powerful tool for visualizing the univariate distribution of data. We can use it to plot histograms, kernel density estimates, and rug plots, and customize it using various parameters.