
Last modified: 2023-12-03 15:05:07 · Author: Mango

Seaborn Distplot

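Seaborn is a Python data visualization library built on Matplotlib. Its distplot function plots the univariate distribution of a set of observations, by default as a histogram overlaid with a kernel density estimate. Note that distplot has been deprecated since seaborn 0.11 in favor of histplot and displot, so recent versions emit a warning when it is called.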

Importing Seaborn and Other Libraries

Before using the distplot function, we need to import the necessary libraries: Seaborn itself, plus Matplotlib, NumPy, and pandas, which are commonly used alongside it.

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

Creating a Random Dataset

Let's create a random dataset to plot: 1000 draws from a standard normal distribution, with a fixed seed so the result is reproducible.

np.random.seed(123)
data = np.random.normal(size=1000)

Basic Syntax of distplot

The basic syntax of the distplot function is as follows:

sns.distplot(a, bins=None, hist=True, kde=True, rug=False, fit=None, hist_kws=None, kde_kws=None, rug_kws=None, fit_kws=None, color=None, vertical=False, norm_hist=False, axlabel=None, label=None, ax=None)

The parameters of the distplot function are described below.

  1. a: The input data: a pandas Series, a 1-D NumPy array, or a list of observations.

  2. bins: The histogram bin specification, passed on to Matplotlib's hist(). It can be a number of bins or a sequence of bin edges. If None, a default is chosen automatically.

  3. hist: Whether to draw the histogram (default True).

  4. kde: Whether to draw a Gaussian kernel density estimate (default True).

  5. rug: Whether to draw a small tick on the support axis at each observation (default False).

  6. fit: An object with a fit method whose result can be passed to its pdf method, such as scipy.stats.norm. The fitted density is drawn over the plot.

  7. hist_kws: A dictionary of keyword arguments for the histogram, passed to Matplotlib's hist().

  8. kde_kws: A dictionary of keyword arguments for the density curve, passed to kdeplot().

  9. rug_kws: A dictionary of keyword arguments for the rug plot, passed to rugplot().

  10. fit_kws: A dictionary of keyword arguments for the line that draws the fitted distribution.

  11. color: The Matplotlib color used for every component of the plot except the fitted curve.

  12. vertical: If True, the observed values are placed on the y-axis instead of the x-axis.

  13. norm_hist: If True, the histogram heights show a density rather than a count. This is implied whenever a KDE or fitted density is drawn.

  14. axlabel: The label for the support axis. If None, the name of a is used when it has one; if False, no label is set.

  15. label: The legend label for the relevant component of the plot.

  16. ax: The Matplotlib Axes object to draw on. If None, the current axes are used.
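To see several of these parameters working together, here is a minimal sketch. It assumes a seaborn release that still ships the deprecated distplot function and that SciPy is installed; the bin count, labels, and styling values are arbitrary choices for illustration.

```python
import warnings

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from scipy import stats

np.random.seed(123)
data = np.random.normal(size=1000)

fig, ax = plt.subplots()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the deprecation warning from distplot
    sns.distplot(
        data,
        bins=20,                          # 20 equal-width bins
        kde=False,                        # no kernel density estimate
        fit=stats.norm,                   # overlay a fitted normal density instead
        hist_kws={"edgecolor": "white"},  # forwarded to Matplotlib's hist()
        fit_kws={"linewidth": 2},         # styling of the fitted curve
        axlabel="value",                  # label for the x-axis
        label="sample",                   # legend entry for the histogram
        ax=ax,                            # draw on the axes created above
    )
ax.legend()
```

Because fit is given, the histogram is drawn as a density even though norm_hist is left at its default, so the bars and the fitted curve share the same scale.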

Basic Example of distplot

Now, let's plot the randomly generated dataset using the distplot function.

sns.distplot(data)

The resulting figure shows a histogram with a kernel density estimate drawn over it. Because bins is not given, the number of bins is chosen automatically.
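To our understanding, the automatic choice follows the Freedman-Diaconis rule with an upper limit of 50 bins. The sketch below is our own reimplementation of that rule in NumPy, not seaborn's code; the helper name freedman_diaconis_bins and its max_bins argument are made up for this example.

```python
import numpy as np


def freedman_diaconis_bins(values, max_bins=50):
    """Suggest a bin count from the Freedman-Diaconis rule, capped at max_bins."""
    values = np.asarray(values, dtype=float)
    if values.size < 2:
        return 1
    q75, q25 = np.percentile(values, [75, 25])
    # Bin width: twice the interquartile range divided by the cube root of n.
    bin_width = 2 * (q75 - q25) / values.size ** (1 / 3)
    if bin_width == 0:
        # No spread in the middle half of the data: fall back to a square-root rule.
        return min(int(np.sqrt(values.size)), max_bins)
    n_bins = int(np.ceil((values.max() - values.min()) / bin_width))
    return min(n_bins, max_bins)


np.random.seed(123)
data = np.random.normal(size=1000)
n_bins = freedman_diaconis_bins(data)
print("suggested number of bins:", n_bins)
```

The rule widens the bins when the data are spread out and narrows them as the sample grows, which is why it is a reasonable default when we do not want to pick a number by hand.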

Customizing distplot

We can customize the distplot function using various parameters. Let's explore some of the most commonly used parameters of the distplot function.

bins

The bins parameter sets the bins of the histogram. Let's use 30 bins.

sns.distplot(data, bins=30)
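The bins value is handed to Matplotlib's hist(), which bins data the same way as numpy.histogram. As a rough illustration of what the two accepted forms mean, the sketch below calls numpy.histogram directly; the custom edges are an arbitrary choice for this example.

```python
import numpy as np

np.random.seed(123)
data = np.random.normal(size=1000)

# An integer asks for that many equal-width bins spanning the data range.
counts, edges = np.histogram(data, bins=30)

# A sequence gives the bin edges explicitly (here 16 bins of width 0.5).
custom_edges = np.linspace(-4, 4, 17)
custom_counts, _ = np.histogram(data, bins=custom_edges)

print(len(counts), len(edges))  # 30 bins are delimited by 31 edges
print(len(custom_counts))
```

Explicit edges are useful when several plots must share exactly the same bins.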

hist

The hist parameter controls whether the histogram is drawn. If we set it to False, only the kernel density estimate is plotted.

sns.distplot(data, hist=False)
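To make the idea of a kernel density estimate concrete, here is a small sketch that computes one with SciPy's gaussian_kde. This is a stand-in for illustration: seaborn's own curve may use a different bandwidth and evaluation grid, and the grid limits below are our own choice.

```python
import numpy as np
from scipy.stats import gaussian_kde

np.random.seed(123)
data = np.random.normal(size=1000)

kde = gaussian_kde(data)  # bandwidth is chosen by Scott's rule by default

# Evaluate the estimate on an evenly spaced grid that extends past the data.
grid = np.linspace(data.min() - 3, data.max() + 3, 500)
density = kde(grid)

# A density curve should enclose an area of about 1.
step = grid[1] - grid[0]
area = float(density.sum() * step)
print("area under the curve:", round(area, 3))
```

The estimate places a small Gaussian bump on every observation and sums them, so the curve is smooth even where the histogram would look jagged.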

kde

The kde parameter controls whether the kernel density estimate is drawn. If we set it to False, only the histogram is plotted, and its bars show counts rather than densities unless norm_hist is True.

sns.distplot(data, kde=False)
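The difference between a count histogram and a normalized one can be checked numerically. The sketch below uses numpy.histogram rather than seaborn, and the choice of 30 bins is arbitrary.

```python
import numpy as np

np.random.seed(123)
data = np.random.normal(size=1000)

# Raw counts: the bar heights add up to the number of observations.
counts, edges = np.histogram(data, bins=30)

# Normalized histogram: the bar areas (height times width) add up to 1.
density, _ = np.histogram(data, bins=30, density=True)
widths = np.diff(edges)

print("total count:", counts.sum())
print("total area: ", (density * widths).sum())
```

Each density value equals the count divided by the sample size times the bin width, which is what lets a histogram share an axis with a density curve.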

rug

Setting the rug parameter to True adds a small tick along the x-axis at each observation.

sns.distplot(data, rug=True)

color

The color parameter sets the color of the plot components. It accepts any Matplotlib color.

sns.distplot(data, color='red')

vertical

Setting the vertical parameter to True places the observed values on the y-axis, so the whole plot is rotated by 90 degrees.

sns.distplot(data, vertical=True)

Conclusion

The distplot function is a powerful tool for visualizing the univariate distribution of data. We can use it to plot histograms, kernel density estimates, and rug plots, and customize it using various parameters.