Last modified: 2023-12-03 14:45:02             Author: Mango
Pandas is a popular Python library for data analysis. A common task is to sum data along the rows or columns of a DataFrame, which Pandas handles with the sum() method.
DataFrame.sum(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
The sum() method takes several parameters.
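axis: Specifies the axis along which the sum is computed. 0 (or 'index') sums down each column, and 1 (or 'columns') sums across each row. When axis is not given, the sum is computed down each column, so the result holds one value per column rather than a single total for the whole DataFrame.
skipna: Specifies whether NaN (Not a Number) values are skipped while computing the sum. By default NaN values are excluded, which corresponds to skipna=True.
level: Specifies the level of a MultiIndex along which the sum is computed. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0; on newer versions, use groupby(level=...).sum() instead.
numeric_only: Specifies whether to include only numeric columns in the computation. With a value of None, pandas tries to use all columns and falls back to numeric columns only if that fails. Since pandas 2.0 the default is False, so all columns are included.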
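As a minimal sketch of how axis, skipna, and numeric_only interact, the following uses a small made-up table (the scores DataFrame and its column names are assumptions for illustration, not part of the example used in this article):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing score and one non-numeric column.
scores = pd.DataFrame({
    "team": ["red", "blue", "green"],
    "round1": [10.0, np.nan, 30.0],
    "round2": [1.0, 2.0, 3.0],
})

# Restrict the sum to numeric columns; NaN is skipped by default.
col_sums = scores.sum(numeric_only=True)

# With skipna=False, a NaN in a column makes that column's sum NaN.
strict_sums = scores.sum(numeric_only=True, skipna=False)

# axis=1 sums across each row instead of down each column.
row_sums = scores[["round1", "round2"]].sum(axis=1)

print(col_sums)
print(strict_sums)
print(row_sums)
```

Passing numeric_only=True explicitly keeps the result the same regardless of which default the installed pandas version uses.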
Let us consider a sample DataFrame to understand how the sum() method works.
import pandas as pd

# Sample data: one row per employee
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
})
This DataFrame contains information about four employees. We can compute the sum of their salaries as follows:
total_salary = df['Salary'].sum()
print(total_salary)
Output:
260000
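The same call also works on several columns at once. As a small sketch reusing the sample data, selecting the numeric columns first and calling sum() without an axis returns one total per column:

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "Salary": [50000, 60000, 70000, 80000],
})

# Selecting the numeric columns first keeps the Name strings out of the sum.
column_totals = df[["Age", "Salary"]].sum()
print(column_totals)
```

The result is a Series indexed by column name, so column_totals["Salary"] gives the same total as df['Salary'].sum().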
We can also compute the sum of each row by calling sum() with axis=1.
total = df[['Age', 'Salary']].sum(axis=1)
print(total)
Output:
0    50025
1    60030
2    70035
3    80040
dtype: int64
In this example, we have computed the sum of the Age and Salary columns for each row.
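Summing by a MultiIndex level, which the level parameter was meant for, can be sketched with groupby(level=...), an approach that does not rely on the level argument of sum(). The department assignments below are invented for illustration and do not come from the sample data:

```python
import pandas as pd

# Hypothetical salaries indexed by (department, employee).
index = pd.MultiIndex.from_tuples(
    [("Sales", "Alice"), ("Sales", "Bob"), ("IT", "Charlie"), ("IT", "David")],
    names=["Dept", "Name"],
)
salaries = pd.Series([50000, 60000, 70000, 80000], index=index, name="Salary")

# Group on the "Dept" level of the index, then sum within each group.
dept_totals = salaries.groupby(level="Dept").sum()
print(dept_totals)
```

Here dept_totals is a Series with one entry per department.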
The Pandas sum() method is a convenient way to compute sums over a DataFrame in Python. By specifying parameters such as axis, skipna, and level, we can tailor the computation to our requirements.