📅  Last modified: 2023-12-03 14:45:02.503000             🧑  Author: Mango
A pandas MultiIndex lets a DataFrame carry several index levels at once, so hierarchical data can be stored, selected, and aggregated in a structured way. It is especially useful for time-series data, financial data, or any data with a natural hierarchy.
We can create a MultiIndex DataFrame by building a MultiIndex object (here with pd.MultiIndex.from_tuples) and passing it to the index parameter of the DataFrame constructor. Here is an example:
import pandas as pd
# Create a DataFrame with MultiIndex
df = pd.DataFrame(
    data={"sales": [100, 200, 150, 300, 250, 350],
          "expenses": [50, 80, 70, 100, 90, 120]},
    index=pd.MultiIndex.from_tuples(
        [("Q1", "January"), ("Q1", "February"), ("Q1", "March"),
         ("Q2", "April"), ("Q2", "May"), ("Q2", "June")],
        names=["Quarter", "Month"]
    )
)
In this example, we created a DataFrame with two index levels, Quarter and Month. The sales and expenses columns hold the data for this DataFrame.
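When the level values already sit in ordinary columns, the same structure can be reached another way. The following is a minimal sketch, assuming a flat table whose Quarter and Month columns mirror the example above; the names flat and df_from_columns are illustrative only.

```python
import pandas as pd

# Flat table: the future index levels are still ordinary columns
flat = pd.DataFrame({
    "Quarter": ["Q1", "Q1", "Q1", "Q2", "Q2", "Q2"],
    "Month": ["January", "February", "March", "April", "May", "June"],
    "sales": [100, 200, 150, 300, 250, 350],
    "expenses": [50, 80, 70, 100, 90, 120],
})

# Passing a list of column names to set_index promotes them to index levels
df_from_columns = flat.set_index(["Quarter", "Month"])

print(df_from_columns.index.names)
print(df_from_columns.index.nlevels)
```

This route is often convenient when the data is loaded from a CSV file or a database, where the hierarchy arrives as plain columns.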
To access a specific element in a MultiIndex DataFrame, we can use the .loc accessor and pass a tuple with one value per index level, followed by the column label. Here's an example:
# Get the sales for Q1, January
sales_q1_jan = df.loc[("Q1", "January"), "sales"]
print(sales_q1_jan)
Output:
100
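The same tuple key can also return a whole row rather than a single cell. Here is a small sketch that rebuilds the example DataFrame so it runs on its own; the variable name may_row is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"sales": [100, 200, 150, 300, 250, 350],
     "expenses": [50, 80, 70, 100, 90, 120]},
    index=pd.MultiIndex.from_tuples(
        [("Q1", "January"), ("Q1", "February"), ("Q1", "March"),
         ("Q2", "April"), ("Q2", "May"), ("Q2", "June")],
        names=["Quarter", "Month"],
    ),
)

# A full row key plus ":" for the columns returns the row as a Series
# whose index is the column labels
may_row = df.loc[("Q2", "May"), :]
print(may_row["sales"], may_row["expenses"])
```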
We can also pass only the outermost level value to the .loc accessor to select every row under that label. Here's an example:
# Get the sales for Q1
sales_q1 = df.loc["Q1", "sales"]
print(sales_q1)
Output:
Month
January     100
February    200
March       150
Name: sales, dtype: int64
In this example, we returned all the sales values for the Q1 quarter. The Quarter level is dropped from the result, so the returned Series is indexed by Month alone.
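Selecting on the outer level works directly with .loc, but selecting on the inner level (Month) needs a different call. One option is the DataFrame.xs method with its level argument, sketched below on a rebuilt copy of the example data; the variable name may is an assumption.

```python
import pandas as pd

df = pd.DataFrame(
    {"sales": [100, 200, 150, 300, 250, 350],
     "expenses": [50, 80, 70, 100, 90, 120]},
    index=pd.MultiIndex.from_tuples(
        [("Q1", "January"), ("Q1", "February"), ("Q1", "March"),
         ("Q2", "April"), ("Q2", "May"), ("Q2", "June")],
        names=["Quarter", "Month"],
    ),
)

# Cross-section on the inner level: keep rows whose Month is "May".
# By default xs drops the selected level, leaving Quarter as the index.
may = df.xs("May", level="Month")
print(may)
```

Because the matched level is dropped, the result here is a one-row DataFrame indexed by Quarter.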
We can aggregate data in a MultiIndex DataFrame using the .groupby() method, which accepts the name of an index level. Here's an example:
# Get the total sales and expenses for each quarter
quarterly_totals = df.groupby("Quarter").sum()
print(quarterly_totals)
Output:
         sales  expenses
Quarter
Q1         450       200
Q2         900       310
In this example, we used the .groupby() method to group the data by Quarter and then calculated the sum of sales and expenses for each group.
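Grouping by an index level can also produce several statistics in one pass. The sketch below uses groupby(level=...) together with named aggregation on a rebuilt copy of the example data; the output column names total_sales and mean_sales are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"sales": [100, 200, 150, 300, 250, 350],
     "expenses": [50, 80, 70, 100, 90, 120]},
    index=pd.MultiIndex.from_tuples(
        [("Q1", "January"), ("Q1", "February"), ("Q1", "March"),
         ("Q2", "April"), ("Q2", "May"), ("Q2", "June")],
        names=["Quarter", "Month"],
    ),
)

# level="Quarter" states explicitly that the key is an index level.
# Each keyword argument maps an output column to (input column, function).
summary = df.groupby(level="Quarter").agg(
    total_sales=("sales", "sum"),
    mean_sales=("sales", "mean"),
)
print(summary)
```

Spelling out level= avoids ambiguity if a column ever shares a name with an index level.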
Pandas MultiIndex DataFrames provide an organized way to work with hierarchical data. We can select individual values or whole groups of rows with the .loc accessor and aggregate by index level with the .groupby() method, which makes them a practical tool for data with more than one natural key.