Python | Pandas dataframe.resample()

Last modified: 2022-05-13 01:54:56.075000 | Author: Mango


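Python is a great language for data analysis, primarily because of its rich ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.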

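The Pandas dataframe.resample() function is used primarily with time series data. It is a convenience method for frequency conversion and resampling of time series.
A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive, equally spaced points in time. To be resampled, an object must have a datetime-like index (DatetimeIndex, PeriodIndex, or TimedeltaIndex), or datetime-like values must be passed through the on or level keyword.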
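
As a minimal sketch of the index requirement, the snippet below resamples the same data in two ways: once through the on keyword, and once after moving the dates into the index. The frame, its column names ("date", "value"), and the 3-day rule are made up for illustration.

```python
import pandas as pd

# Hypothetical data: six daily readings with the dates in an ordinary column.
df = pd.DataFrame({
    "date": pd.date_range("2018-01-01", periods=6, freq="D"),
    "value": [1, 2, 3, 4, 5, 6],
})

# Option 1: leave the index alone and point resample() at the column with `on`.
by_column = df.resample("3D", on="date")["value"].sum()

# Option 2: make the datetime column the index first, then resample.
by_index = df.set_index("date")["value"].resample("3D").sum()

print(by_column)
print(by_index)
```

Both calls group the six days into two 3-day bins starting on 2018-01-01 and 2018-01-04, so they give the same sums (6 and 15).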

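Resampling regroups the observations of a time series into bins of a new frequency, and each bin is then summarized, for example by its mean. We can apply various frequencies to resample our time series data. This is a very important technique in the field of analytics.
The most commonly used time series frequencies are:
W : weekly frequency
M : month-end frequency
SM : semi-month-end frequency (15th and end of month)
Q : quarter-end frequency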

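Many other time series frequencies are available. Note that in pandas 2.2 and later the month-end, semi-month-end, and quarter-end aliases are spelled ME, SME, and QE, and the older spellings M, SM, and Q used in this article are deprecated there. Let's see how to apply these frequencies to data and resample it.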
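
To get a feel for how each frequency slices a year, here is a small sketch on made-up data: a series holding 1.0 for every day of 2018, so that summing each bin simply counts its days. It passes offset objects from pd.offsets rather than alias strings, as an assumption-free way to sidestep the alias spelling differences between pandas versions. The dictionary keys are only labels chosen for this example.

```python
import pandas as pd

# Hypothetical series: the value 1.0 on every calendar day of 2018.
days = pd.date_range("2018-01-01", "2018-12-31", freq="D")
ones = pd.Series(1.0, index=days)

# Offset objects corresponding to the frequencies listed above.
rules = {
    "W": pd.offsets.Week(weekday=6),   # weekly, weeks ending on Sunday
    "M": pd.offsets.MonthEnd(),        # month end
    "SM": pd.offsets.SemiMonthEnd(),   # 15th and month end
    "Q": pd.offsets.QuarterEnd(),      # quarter end
}

# Summing the ones counts how many days fall into each bin.
bin_counts = {name: ones.resample(rule).sum() for name, rule in rules.items()}

for name, counts in bin_counts.items():
    print(name, "->", len(counts), "bins")
```

With this input the monthly rule yields 12 bins and the quarterly rule 4, while the weekly rule yields 53 because 2018-12-31 falls into a week that ends in January 2019.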

The examples below read their data from a CSV file (apple.csv in the code). It contains one year of Apple stock price data, from 13-11-17 to 13-11-18.

Example #1: Resampling the data on a monthly frequency

# importing pandas as pd
import pandas as pd

# By default the "date" column is read as strings,
# so we convert it to date-time format while loading:
# parse_dates=["date"] converts the "date" column to date-time format.

# Resampling needs a datetime-like index, so "date" also becomes the index:
# index_col="date" makes the "date" column the index of the data frame.
df = pd.read_csv("apple.csv", parse_dates=["date"], index_col="date")

# Printing the first 10 rows of the dataframe
print(df.head(10))

# Resampling the time series data based on months;
# we apply it to the stock's closing price.
# 'M' indicates month-end frequency
monthly_resampled_data = df.close.resample('M').mean()

# the above command finds the mean closing price
# of each calendar month covered by the data.
print(monthly_resampled_data)
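
Because apple.csv is not reproduced here, the sketch below repeats the same steps on a few made-up rows read from an in-memory string. The column names mirror the ones used in the example, but the prices are invented. pd.offsets.MonthEnd() is used as the month-end rule so the snippet does not depend on which alias spelling the installed pandas accepts.

```python
import io
import pandas as pd

# Made-up rows standing in for apple.csv; these are not real prices.
csv_text = """date,open,close
2018-01-30,10.0,11.0
2018-01-31,11.0,13.0
2018-02-01,13.0,12.0
2018-02-02,12.0,16.0
"""

# Same loading pattern as above: parse "date" and use it as the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

# Mean closing price per month, each bin labeled with the month's last day.
monthly_mean = df["close"].resample(pd.offsets.MonthEnd()).mean()
print(monthly_mean)
```

The two January closes (11.0 and 13.0) average to 12.0 under the label 2018-01-31, and the two February closes average to 14.0 under 2018-02-28.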

Example #2: Resampling the data on a weekly frequency

# importing pandas as pd
import pandas as pd

# Resampling needs a datetime-like index, so we parse the
# "date" column and use it as the index:
# index_col="date" makes the "date" column the index of the data frame.
df = pd.read_csv("apple.csv", parse_dates=["date"], index_col="date")

# Resampling the time series data based on weekly frequency;
# we apply it to the stock's opening price. 'W' indicates week
weekly_resampled_data = df.open.resample('W').mean()

# the above command finds the mean opening price
# of each week over the one-year period.
print(weekly_resampled_data)

Example #3: Resampling the data on a quarterly frequency

# importing pandas as pd
import pandas as pd

# Resampling needs a datetime-like index, so we parse the
# "date" column and use it as the index:
# index_col="date" makes the "date" column the index of the data frame.
df = pd.read_csv("apple.csv", parse_dates=["date"], index_col="date")

# Resampling the time series data based on quarterly frequency;
# 'Q' indicates quarter-end frequency
quarterly_resampled_data = df.open.resample('Q').mean()

# the above command finds the mean opening price
# of each quarter over the one-year period.
print(quarterly_resampled_data)
