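How to Calculate a Moving Average in a Pandas DataFrame?
In this article, we look at how to calculate a moving average in a pandas DataFrame. A moving average, also called a rolling average, is obtained by averaging time-series data over k consecutive time periods, which smooths out short-term fluctuations.
There are three types of moving average:
- Simple moving average (SMA)
- Exponential moving average (EMA)
- Cumulative moving average (CMA)
Link to the data used: RELIANCE.NS_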
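Simple Moving Average (SMA)
A simple moving average is the unweighted mean of the previous K data points. The larger K is, the smoother the curve, but the less closely the average tracks recent changes in the data. If the data points are $p_1, p_2, \dots, p_n$, the simple moving average over the last K points is:
$SMA_K = \frac{p_{n-K+1} + p_{n-K+2} + \dots + p_n}{K}$
In Python, we can compute a moving average with the .rolling() method. It provides rolling windows over the data, and we can apply the mean function to these windows to obtain the moving average. The window size is passed as the argument in .rolling(window).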
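To make the windowing concrete, here is a minimal sketch on a toy series. The values are hypothetical and chosen only so the arithmetic is easy to check by hand. The last rolling value is compared with the mean of the last three points computed directly.

```python
import pandas as pd

# Hypothetical toy prices, used only for illustration
prices = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

# Rolling window of 3 observations; the first two windows are
# incomplete, so their result is NaN
sma3 = prices.rolling(3).mean()

# The last window computed by hand: (30 + 40 + 50) / 3
manual_last = prices.iloc[-3:].sum() / 3

print(sma3.tolist())   # [nan, nan, 20.0, 30.0, 40.0]
print(manual_last)     # 40.0
```

The two leading NaN values are the reason the 30-day example later calls dropna(): a window of size K has no value until K observations are available.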
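Now let's look at an example that computes a simple rolling average over a window of 30 days (30 rows of daily data).
Step 1: Importing Libraries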
Python3
# importing Libraries
# importing pandas as pd
import pandas as pd
# importing numpy as np
# for Mathematical calculations
import numpy as np
# importing pyplot from matplotlib as plt
# for plotting graphs
import matplotlib.pyplot as plt
plt.style.use('default')
%matplotlib inline
Step 2: Importing Data
To import the data, we use the pandas .read_csv() function.
Python3
# importing time-series data
reliance = pd.read_csv('RELIANCE.NS.csv', index_col='Date',
parse_dates=True)
# Printing dataFrame
reliance.head()
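Output:
Step 3: Calculating the Simple Moving Average
To calculate the SMA in Python, we use the pandas DataFrame.rolling() function, which lets us perform calculations over a rolling window. On each rolling window we then call .mean() to compute the average of that window.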
Syntax: DataFrame.rolling(window, min_periods=None, center=False, win_type=None, on=None, axis=0).mean()
Parameters :
- window : Size of the window. That is how many observations we have to take for the calculation of each window.
- min_periods : Least number of observations in a window required to have a value (otherwise result is NA).
- center : It is used to set the labels at the center of the window.
- win_type : It is used to set the window type.
- on : Datetime column of our dataframe on which we have to calculate rolling mean.
- axis : integer or string, default 0
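The effect of min_periods and center is easiest to see on a small series. The sketch below uses made-up values and a window of 3. It is only an illustration of the parameters listed above, not part of the RELIANCE example.

```python
import pandas as pd

# Hypothetical toy data
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Default: a value appears only once the window holds 3 observations
default = s.rolling(window=3).mean()

# min_periods=1: partial windows at the start are averaged as well
partial = s.rolling(window=3, min_periods=1).mean()

# center=True: each label sits in the middle of its window,
# so both ends are incomplete instead of just the start
centered = s.rolling(window=3, center=True).mean()

print(default.tolist())   # [nan, nan, 2.0, 3.0, 4.0]
print(partial.tolist())   # [1.0, 1.5, 2.0, 3.0, 4.0]
print(centered.tolist())  # [nan, 2.0, 3.0, 4.0, nan]
```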
Python3
# updating our dataFrame to have only
# one column 'Close' as rest all columns
# are of no use for us at the moment
# using .to_frame() to convert pandas series
# into dataframe.
reliance = reliance['Close'].to_frame()
# calculating simple moving average
# using .rolling(window).mean() ,
# with window size = 30
reliance['SMA30'] = reliance['Close'].rolling(30).mean()
# removing all the NULL values using
# dropna() method
reliance.dropna(inplace=True)
# printing Dataframe
reliance
Output:
Step 4: Plotting the Simple Moving Average
Python3
# plotting Close price and simple
# moving average of 30 days using .plot() method
reliance[['Close', 'SMA30']].plot(label='RELIANCE',
figsize=(16, 8))
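Output:
Cumulative Moving Average (CMA)
The cumulative moving average is the mean of all values up to and including the current one. For data points $x_1, x_2, \dots$, the CMA at time t is:
$CMA_t = \frac{x_1 + x_2 + \dots + x_t}{t}$
When computing a CMA there is no fixed window size; the window keeps growing as time passes. In Python, we can compute the CMA with the .expanding() method. Now we will look at an example that computes the CMA. The result column is named CMA30 in the code, although no 30-day window is involved.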
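As a quick check of the definition, the sketch below compares .expanding().mean() with the running sum divided by the number of points seen so far. The data are hypothetical values chosen for easy arithmetic.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data
x = pd.Series([2.0, 4.0, 6.0, 8.0])

# Expanding window: each value is the mean of everything up to that row
cma = x.expanding().mean()

# The same quantity from the definition: cumulative sum / count so far
manual = x.cumsum() / np.arange(1, len(x) + 1)

print(cma.tolist())     # [2.0, 3.0, 4.0, 5.0]
print(manual.tolist())  # [2.0, 3.0, 4.0, 5.0]
```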
Step 1: Importing Libraries
Python3
# importing Libraries
# importing pandas as pd
import pandas as pd
# importing numpy as np
# for Mathematical calculations
import numpy as np
# importing pyplot from matplotlib as plt
# for plotting graphs
import matplotlib.pyplot as plt
plt.style.use('default')
%matplotlib inline
Step 2: Importing Data
To import the data, we use the pandas .read_csv() function.
Python3
# importing time-series data
reliance = pd.read_csv('RELIANCE.NS.csv',
index_col='Date',
parse_dates=True)
# Printing dataFrame
reliance.head()
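If the CSV file is not at hand, the loading step can be tried on an in-memory stand-in. The rows below are made up, and only the Date and Close column names are taken from the code in this article. The sketch shows what index_col='Date' and parse_dates=True do to the resulting index.

```python
import io
import pandas as pd

# Hypothetical rows standing in for RELIANCE.NS.csv; values are invented
csv_text = """Date,Close
2020-01-01,101.0
2020-01-02,103.0
2020-01-03,102.0
"""

# index_col='Date' makes the Date column the index;
# parse_dates=True converts that index to datetime values
df = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=True)

print(type(df.index).__name__)                      # DatetimeIndex
print(df.loc[pd.Timestamp('2020-01-02'), 'Close'])  # 103.0
```

A datetime index is what lets the later plots place the prices on a proper time axis.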
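Step 3: Calculating the Cumulative Moving Average
To calculate the CMA in Python, we use the DataFrame.expanding() function. It gives us the cumulative value of an aggregation function (the mean, in this case).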
Syntax: DataFrame.expanding(min_periods=1, center=None, axis=0, method='single').mean()
Parameters:
- min_periods : int, default 1 . Least number of observations in a window required to have a value (otherwise result is NA).
- center : bool, default False . It is used to set the labels at the center of the window.
- axis : int or str, default 0
- method : str {'single', 'table'}, default 'single'
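A small sketch of min_periods on an expanding window, again with hypothetical values: requiring at least three observations leaves the first two results as NaN, after which the usual cumulative mean is returned.

```python
import pandas as pd

# Hypothetical toy data
s = pd.Series([1.0, 3.0, 5.0, 7.0])

# No value until the expanding window holds at least 3 observations
cma_min3 = s.expanding(min_periods=3).mean()

print(cma_min3.tolist())  # [nan, nan, 3.0, 4.0]
```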
Python3
# updating our dataFrame to have only
# one column 'Close' as rest all columns
# are of no use for us at the moment
# using .to_frame() to convert pandas series
# into dataframe.
reliance = reliance['Close'].to_frame()
# calculating cumulative moving
# average using .expanding().mean()
reliance['CMA30'] = reliance['Close'].expanding().mean()
# printing Dataframe
reliance
Output:
Step 4: Plotting the Cumulative Moving Average
Python3
# plotting Close price and cumulative moving
# average using .plot() method
reliance[['Close', 'CMA30']].plot(label='RELIANCE',
figsize=(16, 8))
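Output:
Exponential Moving Average (EMA)
An exponential moving average (EMA) is a weighted average of the data points in which the weights decay exponentially with age, so the most recent points carry the greatest weight. The EMA at time period t is computed as:
$EMA_t = \alpha x_t + (1 - \alpha) EMA_{t-1}$
where $x_t$ is the observation at time t and $\alpha$ is the smoothing factor. In Python, the EMA is computed with the .ewm() method. The decay can be specified through the span argument, as in .ewm(span=...), or alternatively through com, halflife, or alpha.
Now we will look at an example that computes an EMA with a span of 30 days.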
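The recursive formula can be checked against pandas on a toy series. This sketch assumes the recursion is seeded with the first observation, which is what .ewm(adjust=False) does. With the default adjust=True used elsewhere in this article, the early values differ slightly because the weights are renormalised. The data and the alpha value are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data and smoothing factor
x = pd.Series([10.0, 12.0, 11.0, 15.0])
alpha = 0.5

# Manual recursion: EMA_t = alpha * x_t + (1 - alpha) * EMA_{t-1},
# seeded with the first observation
ema_manual = [x.iloc[0]]
for value in x.iloc[1:]:
    ema_manual.append(alpha * value + (1 - alpha) * ema_manual[-1])

# pandas equivalent of the plain recursion
ema_pandas = x.ewm(alpha=alpha, adjust=False).mean()

print(ema_manual)                                  # [10.0, 11.0, 11.0, 13.0]
print(np.allclose(ema_manual, ema_pandas.values))  # True
```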
Step 1: Importing Libraries
Python3
# importing Libraries
# importing pandas as pd
import pandas as pd
# importing numpy as np
# for Mathematical calculations
import numpy as np
# importing pyplot from matplotlib as plt
# for plotting graphs
import matplotlib.pyplot as plt
plt.style.use('default')
%matplotlib inline
Step 2: Importing Data
To import the data, we use the pandas .read_csv() function.
Python3
# importing time-series data
reliance = pd.read_csv('RELIANCE.NS.csv',
index_col='Date',
parse_dates=True)
# Printing dataFrame
reliance.head()
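Output:
Step 3: Calculating the Exponential Moving Average
To calculate the EMA in Python, we use the DataFrame.ewm() function, which provides exponentially weighted calculations. We then call .mean() on it to compute the EMA.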
Syntax: DataFrame.ewm(com=None, span=None, halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, axis=0, times=None).mean()
Parameters:
- com : float, optional . It is the decay in terms of centre of mass.
- span : float, optional . It is the decay in terms of span.
- halflife : float, str, timedelta, optional . It is the decay in terms of halflife.
- alpha : float, optional . It is the smoothing factor, with 0 < alpha <= 1 .
- min_periods : int, default 0. Least number of observations in a window required to have a value (otherwise result is NA).
- adjust : bool, default True . Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings (viewing EWMA as a moving average)
- ignore_na : Ignore missing values when calculating weights; specify True to reproduce pre-0.15.0 behavior.
- axis : The axis to use. The value 0 identifies the rows, and 1 identifies the columns.
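The decay parameters are different ways of specifying the same smoothing factor. In pandas, alpha = 2 / (span + 1) and alpha = 1 / (1 + com), so span=30 corresponds to alpha = 2/31 and com = 14.5. The sketch below checks this on a synthetic series, which is an assumption for illustration and not the RELIANCE data.

```python
import numpy as np
import pandas as pd

# Synthetic series of 50 points, for illustration only
s = pd.Series(np.arange(1.0, 51.0))

span = 30
alpha = 2 / (span + 1)   # smoothing factor implied by the span
com = (span - 1) / 2     # centre of mass implied by the span

by_span = s.ewm(span=span).mean()
by_alpha = s.ewm(alpha=alpha).mean()
by_com = s.ewm(com=com).mean()

# All three parameterisations describe the same weights
print(np.allclose(by_span, by_alpha))  # True
print(np.allclose(by_span, by_com))    # True
```

Only one of com, span, halflife, and alpha can be passed in a single call.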
Python3
# updating our dataFrame to have only
# one column 'Close' as rest all columns
# are of no use for us at the moment
# using .to_frame() to convert pandas
# series into dataframe.
reliance = reliance['Close'].to_frame()
# calculating exponential moving average
# using .ewm(span).mean() , with span = 30
reliance['EWMA30'] = reliance['Close'].ewm(span=30).mean()
# printing Dataframe
reliance
Output:
Step 4: Plotting the Exponential Moving Average
Python3
# plotting Close price and exponential
# moving averages of 30 days
# using .plot() method
reliance[['Close', 'EWMA30']].plot(label='RELIANCE',
figsize=(16, 8))
Output: