Pandas – Rolling Mean by Time Interval

Last modified: 2022-05-13 01:55:43.778000 | Author: Mango

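This article shows how to compute the rolling mean of a DataFrame over a time interval in Python using Pandas.

The Pandas DataFrame.rolling() method lets us perform calculations over a rolling window. In other words, we take a fixed-size window that slides over the data and apply a mathematical operation, such as the mean, to the values inside it.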
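To make the idea of a sliding window concrete, here is a minimal sketch on a tiny Series. The values are made up for illustration and are not taken from the Tesla dataset; `min_periods` is a standard `rolling()` parameter that is shown here only as an optional variation.

```python
import pandas as pd

# Hypothetical toy values, not taken from the Tesla dataset
prices = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

# Window of 3 rows: each result is the mean of the current row
# and the 2 rows before it. The first 2 results are NaN because
# a full window is not yet available.
rolling_mean = prices.rolling(3).mean()

# min_periods=1 allows partial windows at the start to produce a value
partial_mean = prices.rolling(3, min_periods=1).mean()

print(rolling_mean.tolist())  # [nan, nan, 20.0, 30.0, 40.0]
print(partial_mean.tolist())  # [10.0, 15.0, 20.0, 30.0, 40.0]
```

With the default settings, a window of size n yields NaN for the first n - 1 rows, which is the behavior seen for the larger windows used in this article.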

Dataset used: Tesla_Stock

Step-by-Step Implementation

Step 1: Import the library

Python3
# Import the pandas library
import pandas as pd


Step 2: Import the data

Python3

# Load the data, using the 'Date' column as a parsed datetime index
tesla_df = pd.read_csv('Tesla_Stock.csv', index_col='Date',
                       parse_dates=True)

# Display the first 10 rows of the DataFrame
tesla_df.head(10)

Output: the first 10 rows of the DataFrame.

We will compute the rolling mean of the DataFrame's 'Close' column.
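Before computing rolling statistics, it can help to confirm that the index was actually parsed as dates and is in chronological order. The following is a minimal sketch under stated assumptions: it uses a small in-memory CSV with made-up values in place of Tesla_Stock.csv, and only the column names 'Date' and 'Close' come from the article.

```python
import io
import pandas as pd

# Made-up rows standing in for Tesla_Stock.csv (hypothetical values);
# the rows are deliberately out of order
csv_text = """Date,Close
2012-01-05,27.12
2012-01-03,28.08
2012-01-04,27.71
"""

df = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=True)

# With parse_dates=True, the index column should become a DatetimeIndex
print(type(df.index).__name__)

# Sort chronologically so that rolling windows follow time order
df = df.sort_index()
print(df.index.is_monotonic_increasing)
```

A sorted datetime index matters most for windows defined by a time span, which require the index to be monotonic.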

Step 3: Compute the rolling mean

Python3

# Keep only the 'Close' column, since the other columns are not
# needed here. .to_frame() converts the resulting Series back
# into a DataFrame.
tesla_df = tesla_df['Close'].to_frame()

# Compute the rolling mean over a window of 30 rows and store it
# in a new column. All other parameters keep their defaults.
tesla_df['MA30'] = tesla_df['Close'].rolling(30).mean()

# A rolling mean is also called a moving average, hence the name
# MA30: the moving average over 30 rows (30 days if the data has
# one row per day).

# Display the DataFrame
tesla_df

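Output:

The first 29 rows of the MA30 column are NaN, and the first non-NaN value appears in row 30, because by default a full window of 30 values is required. Next, we compute the rolling mean with a window of 200.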

Python3

# Compute the rolling mean over a window of 200 rows and store it
# in a new column. All other parameters keep their defaults.
tesla_df['MA200'] = tesla_df['Close'].rolling(200).mean()

# MA200 is the moving average (rolling mean) over 200 rows
# (200 days if the data has one row per day).

# Display the DataFrame
tesla_df

Output:

For 'MA200', the first non-NaN value appears in row 200. Now let's plot 'MA30', 'MA200', and 'Close' to visualize them together.

Step 4: Plotting

Python3

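# Import matplotlib and use its default style
import matplotlib.pyplot as plt
plt.style.use('default')

# %matplotlib inline is a Jupyter/IPython magic that renders static
# plots in the notebook; remove this line in a plain Python script
%matplotlib inline

# Plot the closing price together with both moving averages
tesla_df[['Close', 'MA30', 'MA200']].plot(
    label='tesla', figsize=(16, 8))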

Output: a line chart of 'Close', 'MA30', and 'MA200' plotted against the date index.
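The integer windows used in the steps above count rows, not calendar time. When the data has a datetime index, `rolling()` also accepts an offset string such as '30D', which defines the window as a span of time. The sketch below contrasts the two on hypothetical prices with a gap in the dates; the values and dates are made up, and a 3-day window is used only to keep the example small.

```python
import pandas as pd

# Hypothetical closing prices with a gap:
# there are no rows for 2022-01-04 and 2022-01-05
dates = pd.to_datetime(['2022-01-01', '2022-01-02', '2022-01-03',
                        '2022-01-06', '2022-01-07'])
close = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0], index=dates, name='Close')

# Row-based window: always the last 3 rows, regardless of date gaps
by_rows = close.rolling(3).mean()

# Time-based window: all rows in the 3 calendar days ending at each
# timestamp. For offset windows, min_periods defaults to 1, so the
# first rows are not NaN.
by_time = close.rolling('3D').mean()

print(pd.DataFrame({'Close': close, 'by_rows': by_rows, 'by_time': by_time}))
```

On 2022-01-06 the row-based mean still includes the values from 2022-01-02 and 2022-01-03, while the time-based mean uses only the row from 2022-01-06, because the earlier rows fall outside the 3-day span. For irregularly spaced data such as trading days, this difference determines whether 'MA30' means 30 observations or 30 calendar days.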