📅 Last modified: 2023-12-03 15:33:23.602000 🧑 Author: Mango
Pandas GroupBy splits a DataFrame into groups based on the values in one or more columns and applies a function to each group. It is a core technique for data analysis and is widely used in data science.
grouped = dataframe.groupby(column_name)
dataframe: The Pandas DataFrame object.
column_name: The name of the column to group the data by.
Suppose we have a Pandas DataFrame containing the following data:
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35, 25, 30, 35],
        'Salary': [50000, 60000, 70000, 55000, 65000, 75000]}
df = pd.DataFrame(data)
We can group the data by the Name column and calculate the mean salary for each group using the mean function:
grouped = df.groupby('Name')
mean_salary = grouped['Salary'].mean()
print(mean_salary)
Output:
Name
Alice      52500.0
Bob        62500.0
Charlie    72500.0
Name: Salary, dtype: float64
In the above example, we grouped the data by the Name column and applied the mean function to the Salary column. The result is a new Series object containing the mean salary for each group.
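To make the shape of that result concrete, the following sketch rebuilds the same DataFrame and inspects the returned Series. The names alice_mean and as_frame are illustrative and do not appear in the original example.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35, 25, 30, 35],
        'Salary': [50000, 60000, 70000, 55000, 65000, 75000]}
df = pd.DataFrame(data)

mean_salary = df.groupby('Name')['Salary'].mean()

# The group keys (the distinct Name values) become the index of the Series.
print(list(mean_salary.index))

# Look up the mean for a single group by its key.
alice_mean = mean_salary.loc['Alice']
print(alice_mean)

# reset_index() converts the Series into a DataFrame with Name and Salary columns.
as_frame = mean_salary.reset_index()
print(as_frame)
```

Converting back to a DataFrame with reset_index() is often convenient when the grouped result needs to be merged with other tables or written to a file.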
Pandas GroupBy groups rows by key and aggregates each group. It lets you apply functions to each group and is useful in many contexts, such as computing per-group means, sums, or counts. If you work with data in Python, it is a technique worth knowing well.
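As a closing illustration, here is a minimal sketch of two variations mentioned in this article but not shown above: grouping by more than one column, and applying several functions to each group. It reuses the same sample data; the names by_name_age and salary_stats are illustrative.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35, 25, 30, 35],
        'Salary': [50000, 60000, 70000, 55000, 65000, 75000]}
df = pd.DataFrame(data)

# Group by two columns by passing a list of column names.
# The result is indexed by (Name, Age) pairs.
by_name_age = df.groupby(['Name', 'Age'])['Salary'].sum()
print(by_name_age)

# Apply several aggregation functions to each group at once with agg().
# Each function becomes one column of the resulting DataFrame.
salary_stats = df.groupby('Name')['Salary'].agg(['min', 'max', 'mean'])
print(salary_stats)
```

Passing a list to groupby produces a MultiIndex on the result, while passing a list to agg produces one output column per function.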