📜  How to Calculate Timedelta in Months in Pandas

📅  Last modified: 2022-05-13 01:54:19.905000             🧑  Author: Mango


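The difference between two dates or times is represented as a timedelta object. A timedelta is a duration: the span between two dates, datetimes, or times. Adding a timedelta to a date, or subtracting it, gives a date in the future or the past. When the difference between two dates is expressed in months, it is called a time delta in months. Because months vary in length, a pandas Timedelta has no month unit, so the month count has to be computed another way. Let's demonstrate a few ways to calculate the time delta in months in pandas.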
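As a starting point, the following minimal sketch shows what plain subtraction gives. The two dates are illustrative values chosen for this example. The result is a Timedelta measured in days, which is why the methods below take a different route to get months.

```python
import pandas as pd

# Illustrative dates (assumed for this sketch)
start = pd.Timestamp("2018-12-11")
end = pd.Timestamp("2019-06-12")

# Subtracting two timestamps yields a pandas Timedelta
delta = end - start

print(type(delta).__name__)  # Timedelta
print(delta.days)            # 183
```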

Method 1: Calculating Timedelta with pandas.Series.dt.to_period()

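The pandas.Series.dt.to_period() function:

In this example, we read time.csv and convert the values in each date column to datetime. After the conversion, we use pandas.Series.dt.to_period() to turn each date into a monthly period; the "M" string passed to to_period() stands for month. Subtracting the start periods from the end periods returns MonthEnd offset objects whose count is the number of months between the two dates.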
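The same steps can be tried without a CSV file. This is a minimal sketch that assumes a small in-memory DataFrame with illustrative dates standing in for time.csv. It also reads the integer month count from each returned offset through its `n` attribute.

```python
import pandas as pd

# Illustrative data standing in for time.csv (assumed values)
data = pd.DataFrame({
    "start_date": ["2018-12-11", "2018-07-01"],
    "end_date": ["2019-06-12", "2019-07-12"],
})

# Convert the string columns to datetime
data["start_date"] = pd.to_datetime(data["start_date"])
data["end_date"] = pd.to_datetime(data["end_date"])

# Convert each date to a monthly period and subtract
diff = data["end_date"].dt.to_period("M") - data["start_date"].dt.to_period("M")

# Each element is a MonthEnd offset; .n holds the number of months
month_counts = [offset.n for offset in diff]
print(month_counts)
```

Only the calendar month of each date matters here; the day of the month is discarded by to_period("M").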

CSV used: time.csv, which contains the columns start_date and end_date.

Python3
# import packages and libraries
import pandas as pd
 
# reading the csv file
data = pd.read_csv('time.csv')
 
# converting columns to datetime
data['start_date'] = pd.to_datetime(data['start_date'])
data['end_date'] = pd.to_datetime(data['end_date'])
 
# calculating time delta in months
data['time_delta_months'] = data['end_date'].dt.to_period('M') - \
    data['start_date'].dt.to_period('M')
print(data)


Output:

Method 2: Calculating Timedelta as an integer number of months

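In the previous method, MonthEnd objects are returned. If we want plain integers instead, we have to convert the periods to integers before subtracting, either with the astype() function or with view(dtype='int64').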
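To see why subtracting the converted values gives months, it helps to look at the integers themselves. The sketch below uses illustrative dates and relies on monthly periods being stored as a count of months since January 1970, so the difference of two such integers is a month count.

```python
import pandas as pd

# Illustrative dates (assumed for this sketch)
dates = pd.Series(pd.to_datetime(["2018-12-11", "2019-06-12"]))

# Monthly periods converted to their underlying integer ordinals
ordinals = dates.dt.to_period("M").astype("int64")
print(ordinals.tolist())

# The ordinal of January 1970 is 0, so each value counts months since then
print(pd.Period("1970-01", freq="M").ordinal)

# Difference between the two ordinals = months between the two dates
months_between = int(ordinals.iloc[1] - ordinals.iloc[0])
print(months_between)
```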

Python3

# import packages and libraries
import pandas as pd

# reading the csv file
data = pd.read_csv('time.csv')

# converting columns to datetime
data['start_date'] = pd.to_datetime(data['start_date'])
data['end_date'] = pd.to_datetime(data['end_date'])

# calculating time delta in months as plain integers
# ('int64' is spelled out so the result does not depend on the platform's default int size)
data['time_delta_months'] = data['end_date'].dt.to_period('M').astype('int64') - \
    data['start_date'].dt.to_period('M').astype('int64')

print(data)

Output:

Example 2: Converting to integers with .view(dtype='int64')

Python3

# import packages and libraries
import pandas as pd

# reading the csv file
data = pd.read_csv('time.csv')

# converting columns to datetime
data['start_date'] = pd.to_datetime(data['start_date'])
data['end_date'] = pd.to_datetime(data['end_date'])

# calculating time delta in months as plain integers
# (Series.view is deprecated in recent pandas releases; astype('int64') is the alternative)
data['time_delta_months'] = data['end_date'].dt.to_period('M').view(dtype='int64') -\
    data['start_date'].dt.to_period('M').view(dtype='int64')

print(data)

Output:

Method 3: Calculating Timedelta with a custom function

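Instead of relying on built-in functions, we can write our own user-defined function. The pd.Timestamp() function converts a datetime-like, str, int, or float value into a timestamp. We then extract the year and month values from the timestamps. Since every year has 12 months, we multiply the year difference by 12 and add the month difference.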

Python3

# import packages and libraries
import pandas as pd

# creating a dataframe
data = pd.DataFrame({'startdate': [pd.Timestamp('20181211'),
                                   pd.Timestamp('20180701')],
                     'enddate': [pd.Timestamp('20190612'),
                                 pd.Timestamp('20190712')]})

# months between two datetime columns: 12 per year of difference,
# plus the difference between the calendar months
def time_delta_month(end, start):
    return 12 * (end.dt.year - start.dt.year) \
        + (end.dt.month - start.dt.month)

print(time_delta_month(data['enddate'], data['startdate']))


Output:

0     6
1    12
dtype: int64
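
All three methods count calendar-month boundaries and ignore the day of the month, so 31 January to 1 February counts as one month. If only completed months should count, one possible variation (a sketch, not part of the methods above; the function name full_months_between and the sample dates are assumptions for illustration) subtracts one month whenever the end day falls before the start day:

```python
import pandas as pd

def full_months_between(end, start):
    # Calendar-month difference, as in the custom function above
    months = 12 * (end.dt.year - start.dt.year) + (end.dt.month - start.dt.month)
    # Drop the last month if it has not been completed yet
    incomplete = (end.dt.day < start.dt.day).astype("int64")
    return months - incomplete

# Illustrative dates (assumed for this sketch)
start = pd.Series(pd.to_datetime(["2019-01-31", "2018-12-11"]))
end = pd.Series(pd.to_datetime(["2019-02-01", "2019-06-12"]))

result = full_months_between(end, start)
print(result.tolist())
```

The first pair spans a single day, so it yields 0 here rather than 1; the second pair is unchanged at 6.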