Splitting Data by Date with pandas - Python

Last modified: 2023-12-03 15:18:15.237000 | Author: Mango

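Data analysis often requires splitting and processing data by date. pandas provides a datetime accessor and groupby, which together make this straightforward. This article shows how to split data by year, month, and week with pandas in Python.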

Preparation

Before starting, we need some data to work with. Here is a small sample dataset:

```python
import pandas as pd

# Ten consecutive days in January 2022, stored as strings for now
data = {'date': ['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04', '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08', '2022-01-09', '2022-01-10'],
        'value': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]}
df = pd.DataFrame(data)
```
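The date column in the sample is built from plain strings, so it has to be parsed before any date-based grouping can work. The following is a minimal sketch of parsing with an explicit format and checking for failures. The malformed entry and the names raw, parsed, and n_bad are assumptions made for illustration and are not part of the sample dataset.

```python
import pandas as pd

# Hypothetical input with one malformed entry, for illustration only
raw = pd.Series(['2022-01-01', '2022-01-02', 'not a date'])

# errors='coerce' turns unparseable strings into NaT instead of raising
parsed = pd.to_datetime(raw, format='%Y-%m-%d', errors='coerce')

# Count the rows that failed to parse so they can be inspected or dropped
n_bad = int(parsed.isna().sum())
print(parsed.dtype)
print('Unparseable rows:', n_bad)
```

Whether to drop or repair the rows that come back as NaT depends on the dataset; the sketch only makes them visible.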

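Splitting data by year

We can use the pandas groupby function to split the data by year. The steps are as follows:

Convert the date column to the pandas datetime type so that the .dt accessor can be used:
```python
df['date'] = pd.to_datetime(df['date'])
```

Group the rows by the year of each date:

```python
grouped = df.groupby(df['date'].dt.year)
```

Iterate over the groups and process each one, for example by computing the mean, maximum, and minimum of the value column:

```python
for key, group in grouped:
    print('Year:', key)
    # Aggregate only the numeric column, not the date column
    print('Mean:', group['value'].mean())
    print('Max:', group['value'].max())
    print('Min:', group['value'].min())
```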

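The complete code is as follows:

```python
import pandas as pd

data = {'date': ['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04', '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08', '2022-01-09', '2022-01-10'],
        'value': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]}
df = pd.DataFrame(data)

# Parse the strings into datetimes, then group by the year component
df['date'] = pd.to_datetime(df['date'])
grouped = df.groupby(df['date'].dt.year)

for key, group in grouped:
    print('Year:', key)
    # Aggregate only the numeric column, not the date column
    print('Mean:', group['value'].mean())
    print('Max:', group['value'].max())
    print('Min:', group['value'].min())
```

The code above splits the data by year and prints the mean, maximum, and minimum of value for each year. Because every sample date falls in 2022, the loop body runs once.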
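When the goal is to keep the per-year pieces or to get all statistics at once, the loop can be replaced by a dictionary of sub-frames and a single agg call. This is a minimal sketch under assumed data: the four rows spanning 2021 and 2022 and the names by_year and stats are illustrative and do not come from the sample dataset.

```python
import pandas as pd

# Hypothetical data spanning two years, for illustration only
df = pd.DataFrame({
    'date': pd.to_datetime(['2021-12-30', '2021-12-31', '2022-01-01', '2022-01-02']),
    'value': [10, 20, 30, 40],
})

# Split into one DataFrame per year, keyed by the year number
by_year = {int(year): group for year, group in df.groupby(df['date'].dt.year)}

# Compute the same three statistics for every year in one call
stats = df.groupby(df['date'].dt.year)['value'].agg(['mean', 'max', 'min'])
stats.index.name = 'year'
print(stats)
```

The dictionary is convenient when each year needs further, separate processing; the agg table is enough when only summary numbers are needed.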

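Splitting data by month

We can use the pandas groupby function to split the data by month. The steps are as follows:

Convert the date column to the pandas datetime type so that the .dt accessor can be used:
```python
df['date'] = pd.to_datetime(df['date'])
```

Group the rows by the month of each date:

```python
grouped = df.groupby(df['date'].dt.month)
```

Iterate over the groups and process each one, for example by computing the mean, maximum, and minimum of the value column:

```python
for key, group in grouped:
    print('Month:', key)
    # Aggregate only the numeric column, not the date column
    print('Mean:', group['value'].mean())
    print('Max:', group['value'].max())
    print('Min:', group['value'].min())
```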

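The complete code is as follows:

```python
import pandas as pd

# Three days from each of January, February, and March 2022
data = {'date': ['2022-01-01', '2022-01-02', '2022-01-03', '2022-02-01', '2022-02-02', '2022-02-03', '2022-03-01', '2022-03-02', '2022-03-03'],
        'value': [10, 20, 30, 40, 50, 60, 70, 80, 90]}
df = pd.DataFrame(data)

# Parse the strings into datetimes, then group by the month number (1-12)
df['date'] = pd.to_datetime(df['date'])
grouped = df.groupby(df['date'].dt.month)

for key, group in grouped:
    print('Month:', key)
    # Aggregate only the numeric column, not the date column
    print('Mean:', group['value'].mean())
    print('Max:', group['value'].max())
    print('Min:', group['value'].min())
```

The code above splits the data by month and prints the mean, maximum, and minimum of value for each month.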
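Grouping on dt.month uses only the month number, so rows from the same month in different years end up in one group. If the data spans more than one year, grouping on a year-month period keeps them apart. The sketch below assumes hypothetical data with one 2021 row added; the names by_month_number and by_year_month are illustrative.

```python
import pandas as pd

# Hypothetical data covering January in two different years
df = pd.DataFrame({
    'date': pd.to_datetime(['2021-01-15', '2022-01-01', '2022-01-02', '2022-02-01']),
    'value': [5, 10, 20, 40],
})

# dt.month alone merges January 2021 with January 2022
by_month_number = df.groupby(df['date'].dt.month)['value'].sum()

# dt.to_period('M') keeps the year, so each calendar month is its own group
by_year_month = df.groupby(df['date'].dt.to_period('M'))['value'].sum()

print(by_month_number)
print(by_year_month)
```

For the single-year sample used in this article both approaches give the same groups; the difference only shows up once several years are present.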

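Splitting data by week

We can use the pandas groupby function to split the data by week. The Series.dt.week attribute was deprecated in pandas 1.1 and removed in pandas 2.0, so the code below uses dt.isocalendar().week, which returns the ISO 8601 week number. The steps are as follows:

Convert the date column to the pandas datetime type so that the .dt accessor can be used:
```python
df['date'] = pd.to_datetime(df['date'])
```

Group the rows by the ISO week number of each date:

```python
grouped = df.groupby(df['date'].dt.isocalendar().week)
```

Iterate over the groups and process each one, for example by computing the mean, maximum, and minimum of the value column:

```python
for key, group in grouped:
    print('Week:', key)
    # Aggregate only the numeric column, not the date column
    print('Mean:', group['value'].mean())
    print('Max:', group['value'].max())
    print('Min:', group['value'].min())
```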

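The complete code is as follows:

```python
import pandas as pd

data = {'date': ['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04', '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08', '2022-01-09', '2022-01-10'],
        'value': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]}
df = pd.DataFrame(data)

df['date'] = pd.to_datetime(df['date'])
# dt.week was removed in pandas 2.0; isocalendar().week gives the ISO week number
grouped = df.groupby(df['date'].dt.isocalendar().week)

for key, group in grouped:
    print('Week:', key)
    # Aggregate only the numeric column, not the date column
    print('Mean:', group['value'].mean())
    print('Max:', group['value'].max())
    print('Min:', group['value'].min())
```

The code above splits the data by ISO week and prints the mean, maximum, and minimum of value for each week.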
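ISO week numbers do not always line up with the calendar year: 2022-01-01 and 2022-01-02 fall in ISO week 52 of 2021, so grouping on the week number alone would mix them with late-December dates of 2022 if those were present. A minimal sketch of grouping on ISO year and ISO week together follows; it uses a four-row subset of the sample dates, and the names iso and weekly are illustrative.

```python
import pandas as pd

# A subset of the sample dates that crosses an ISO year boundary
df = pd.DataFrame({
    'date': pd.to_datetime(['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-10']),
    'value': [10, 20, 30, 100],
})

# isocalendar() returns a DataFrame with 'year', 'week' and 'day' columns
iso = df['date'].dt.isocalendar()

# Group on ISO year and ISO week together so the same week number
# in different ISO years is never merged into one group
weekly = df.groupby([iso['year'], iso['week']])['value'].sum()
print(weekly)
```

The result is indexed by (year, week) pairs, which also sorts chronologically, unlike the bare week number where week 52 would sort after weeks 1 and 2.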

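Summary

Splitting data by date with pandas takes only a few lines of code: convert the column with pd.to_datetime, then group on the year, month, or week taken from the .dt accessor. The same pattern applies to other date components in practical work.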