How to Calculate MAPE in Python?

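In this article, we will see how to compute a measure of forecast accuracy called the mean absolute percentage error (MAPE), also known as the mean absolute percentage deviation (MAPD), in Python. MAPE indicates how accurate our forecasts are. The "M" in MAPE stands for mean, because the metric averages over a series of values; "A" stands for absolute, because absolute values keep positive and negative errors from cancelling each other out; "P" stands for percentage, which makes this accuracy measure a relative one; and "E" stands for error, because the metric quantifies how far off our forecasts are.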

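Consider the following example, where we have sales information for a store. The Day No. column identifies the day, the Actual Sales column holds the actual sales for each day, and the Forecast Sales column holds the forecast sales figures (possibly produced by an ML model). The APE column is the absolute percentage error, the error between the actual and forecast values for that day expressed as a fraction of the actual value. The percentage error is (actual value - forecast value) / actual value, and APE is the absolute (non-negative) value of that percentage error.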

Day No.	Actual Sales	Forecast Sales	Absolute Percentage Error (APE)
1	136	134	0.015
2	120	124	0.033
3	138	132	0.043
4	155	141	0.090
5	149	149	0.0
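
As a quick check of the APE column, the short sketch below recomputes each row from the Actual Sales and Forecast Sales values, rounded to three decimals. The variable names are chosen here for illustration only. Note that 2/136 is about 0.0147, which rounds to 0.015.

```python
# Actual and forecast sales for days 1 to 5, taken from the table
actual = [136, 120, 138, 155, 149]
forecast = [134, 124, 132, 141, 149]

# Absolute percentage error per day: |actual - forecast| / actual
ape = [abs(a - f) / a for a, f in zip(actual, forecast)]

# Round to three decimals for display
ape_rounded = [round(x, 3) for x in ape]
print(ape_rounded)  # [0.015, 0.033, 0.043, 0.09, 0.0]
```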

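The MAPE value is then found by taking the mean of the APE values. The formula can be written as:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$

where A_t is the actual value on day t, F_t is the forecast value on day t, and n is the number of days.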
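
As a worked check, the mean of the APE values can be computed by hand from the table's actual and forecast sales. The symbols A_t (actual value), F_t (forecast value) and n (number of days) are notation assumed here for illustration, and the intermediate values are rounded to four decimals.

```latex
\begin{aligned}
\mathrm{MAPE}
  &= \frac{1}{5}\left(
       \frac{|136-134|}{136} + \frac{|120-124|}{120} + \frac{|138-132|}{138}
     + \frac{|155-141|}{155} + \frac{|149-149|}{149}
     \right) \\
  &\approx \frac{0.0147 + 0.0333 + 0.0435 + 0.0903 + 0}{5}
   \approx 0.0364 \quad (\text{about } 3.64\%)
\end{aligned}
```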

Let's see how to do the same in Python for the dataset above:

Python

# Define the dataset as python lists
actual   = [136, 120, 138, 155, 149]
forecast = [134, 124, 132, 141, 149]
  
# Consider a list APE to store the
# APE value for each of the records in dataset
APE = []
  
# Iterate over the list values
for day in range(5):
  
    # Calculate percentage error
    per_err = (actual[day] - forecast[day]) / actual[day]
  
    # Take absolute value of
    # the percentage error (APE)
    per_err = abs(per_err)
  
    # Append it to the APE list
    APE.append(per_err)
  
# Calculate the MAPE
MAPE = sum(APE)/len(APE)
  
# Print the MAPE value and percentage
print(f'''
MAPE   : { round(MAPE, 2) }
MAPE % : { round(MAPE*100, 2) } %
''')

Output:

MAPE   : 0.04
MAPE % : 3.64 %

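The MAPE output is a non-negative float. The best possible MAPE is 0.0, and higher values indicate less accurate forecasts. How large a MAPE must be before a forecast counts as poor depends on the use case. In the output above, the forecasts look reasonably good: the MAPE shows that the daily sales forecasts are off by about 3.64% on average.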
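
To illustrate how the value moves with forecast quality, here is a minimal pure-Python sketch. The helper name `mape_percent` and the `worse_forecast` list (every day forecast 20% too high) are hypothetical and introduced only for this comparison. They are not part of the original program.

```python
def mape_percent(actual, forecast):
    """Return MAPE as a percentage for two equal-length sequences."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors) * 100

actual = [136, 120, 138, 155, 149]
forecast = [134, 124, 132, 141, 149]        # forecast from the sales table
worse_forecast = [a * 1.2 for a in actual]  # hypothetical: 20% too high every day

perfect = mape_percent(actual, actual)      # forecast equals actual
article = mape_percent(actual, forecast)
worse = mape_percent(actual, worse_forecast)

print(round(perfect, 2), round(article, 2), round(worse, 2))  # 0.0 3.64 20.0
```

A perfect forecast gives 0.0, the table's forecast gives about 3.64, and the deliberately poor forecast gives about 20. Where the line between acceptable and poor falls is still a judgment for the use case.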

If you are working with time-series data in Python, you are probably using pandas or NumPy. In that case, you can compute MAPE with the following code.

Python

import pandas as pd
import numpy as np
  
# Define the function to return the MAPE values
def calculate_mape(actual, predicted) -> float:
  
    # Convert actual and predicted
    # to numpy array data type if not already
    if not all([isinstance(actual, np.ndarray),
                isinstance(predicted, np.ndarray)]):
        # Parentheses keep the assignment as one statement across two lines
        actual, predicted = (np.array(actual),
                             np.array(predicted))
  
    # Calculate the MAPE value and return
    return round(np.mean(np.abs((
      actual - predicted) / actual)) * 100, 2)
  
if __name__ == '__main__':
  
    # CALCULATE MAPE FROM PYTHON LIST
    actual    = [136, 120, 138, 155, 149]
    predicted = [134, 124, 132, 141, 149]
  
    # Get MAPE for python list as parameters
    print("py list  :",
          calculate_mape(actual,
                         predicted), "%")
  
    # CALCULATE MAPE FROM NUMPY ARRAY
    actual    = np.array([136, 120, 138, 155, 149])
    predicted = np.array([134, 124, 132, 141, 149])
  
    # Get MAPE for numpy arrays as parameters
    print("np array :", 
          calculate_mape(actual,
                         predicted), "%")
  
    # CALCULATE MAPE FROM PANDAS DATAFRAME
      
    # Define the pandas dataframe
    sales_df = pd.DataFrame({
        "actual"    : [136, 120, 138, 155, 149],
        "predicted" : [134, 124, 132, 141, 149]
    })
  
    # Get MAPE for pandas series as parameters
    print("pandas df:", 
          calculate_mape(sales_df.actual, 
                         sales_df.predicted), "%")

Output:

py list  : 3.64 %
np array : 3.64 %
pandas df: 3.64 %

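In the program above, we defined a function, `calculate_mape()`, that computes MAPE for a given Python list, NumPy array, or pandas Series. The output is the same for all three input types because the same data is passed to the function in each case.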
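
Because the percentage error divides by the actual value, an actual value of 0 makes the result undefined. The sketch below is one possible variant, not the article's function: `safe_mape` is a hypothetical name, and raising `ValueError` on zero actuals or mismatched shapes is an assumed design choice. It uses `np.asarray`, which accepts lists, NumPy arrays and pandas Series without an explicit type check.

```python
import numpy as np

def safe_mape(actual, predicted):
    """Hypothetical helper: MAPE in percent with basic input checks."""
    # np.asarray accepts lists, NumPy arrays and pandas Series alike
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    if np.any(actual == 0):
        # The percentage error divides by the actual value
        raise ValueError("MAPE is undefined when an actual value is 0")

    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

result = safe_mape([136, 120, 138, 155, 149], [134, 124, 132, 141, 149])
print(round(result, 2))  # 3.64
```

Whether to raise, skip the affected rows, or switch to a different metric when actual values can be zero depends on the data, so treat the checks above as a starting point.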