How to Calculate SMAPE in Python?

In this article, we will see how to compute, in Python, a measure of forecast accuracy called the symmetric mean absolute percentage error (SMAPE for short).

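SMAPE is one of the alternatives designed to overcome the limitations of MAPE as a forecast-error measure. Unlike the mean absolute percentage error, SMAPE has both a lower bound and an upper bound (0% and 200% under the definition used in this article), which is why it is called symmetric. In the name SMAPE, "S" stands for symmetric; "M" for mean, since the metric averages over a series of values; "A" for absolute, since it uses absolute values to keep positive and negative errors from cancelling each other out; "P" for percentage, which makes this accuracy metric a relative one; and "E" for error, since the metric measures how much error our forecasts contain.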
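To make the bounds concrete, the sketch below compares MAPE and SMAPE on a single observation as the forecast drifts further from the actual value. The helper names `mape` and `smape` are illustrative and not part of the original article; the sketch assumes the SMAPE variant whose denominator is the mean of the absolute actual and forecast values.

```python
import numpy as np


def mape(actual, predicted):
    """Mean absolute percentage error in percent (illustrative helper)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(predicted - actual) / np.abs(actual)) * 100


def smape(actual, predicted):
    """SMAPE in percent, denominator (|actual| + |predicted|) / 2."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    denominator = (np.abs(actual) + np.abs(predicted)) / 2
    return np.mean(np.abs(predicted - actual) / denominator) * 100


actual = [100]
# As the forecast grows, MAPE keeps growing while SMAPE approaches 200
for forecast in (110, 1_000, 100_000):
    print(forecast,
          round(mape(actual, [forecast]), 2),
          round(smape(actual, [forecast]), 2))
```

Under this definition each term can reach 200% at most, which happens when exactly one of the two values is zero.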

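Formula for SMAPE:

$$\text{SMAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \frac{|F_t - A_t|}{(|A_t| + |F_t|)/2}$$

where $A_t$ is the actual value, $F_t$ is the forecast value, and $n$ is the number of observations.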

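Consider the following example, which contains sales information for a store. The Day No. column gives the day in question, the Actual Sales column gives the actual sales for each day, and the Forecast Sales column gives the forecast sales figures (possibly produced by an ML model). Column A is the absolute error, column B is the mean of the absolute actual and forecast values, and the last column is column A divided by column B.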

Day No.	Actual Sales	Forecast Sales	A = |forecast - actual|	B = (|actual| + |forecast|) / 2	A / B
1	136	134	2	135	0.015
2	120	124	4	122	0.033
3	138	132	6	135	0.044
4	155	141	14	148	0.095
5	149	149	0	149	0

The SMAPE value for the example above is the mean of the entries in the A / B column, which is approximately 0.0373, or 3.73%.
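The table can be rebuilt column by column, which may help when checking the arithmetic by hand. The following is a minimal sketch using pandas; the DataFrame name `sales` and the column labels are chosen here for illustration.

```python
import pandas as pd

# Actual and forecast sales from the example table
sales = pd.DataFrame({
    "actual":   [136, 120, 138, 155, 149],
    "forecast": [134, 124, 132, 141, 149],
})

# Column A: absolute error per day
sales["A"] = (sales["forecast"] - sales["actual"]).abs()
# Column B: mean of the absolute actual and forecast values
sales["B"] = (sales["actual"].abs() + sales["forecast"].abs()) / 2
# Last column: A divided by B
sales["A/B"] = sales["A"] / sales["B"]

# SMAPE as a fraction is the mean of the A/B column
smape_fraction = sales["A/B"].mean()

print(sales.round(3))
print(round(smape_fraction, 4))
```

Multiplying the printed fraction by 100 gives the percentage form reported by the function in the next section.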

Calculating SMAPE in Python

import pandas as pd
import numpy as np
  
# Define the function to return the SMAPE value
def calculate_smape(actual, predicted) -> float:
  
    # Convert actual and predicted to NumPy arrays
    # (lists and pandas Series are converted; arrays pass through)
    actual, predicted = np.asarray(actual), np.asarray(predicted)
  
    return round(
        np.mean(
            np.abs(predicted - actual) / 
            ((np.abs(predicted) + np.abs(actual))/2)
        )*100, 2
    )
  
  
if __name__ == '__main__':
  
    # CALCULATE SMAPE FROM PYTHON LIST
  
    actual    = [136, 120, 138, 155, 149]
    predicted = [134, 124, 132, 141, 149]
  
    # Get SMAPE for python list as parameters
    print("py list  :", 
          calculate_smape(actual, predicted), "%")
  
    # CALCULATE SMAPE FROM NUMPY ARRAY
    actual    = np.array([136, 120, 138, 155, 149])
    predicted = np.array([134, 124, 132, 141, 149])
  
    # Get SMAPE for numpy arrays as parameters
    print("np array :", 
          calculate_smape(actual, predicted), "%")
  
    # CALCULATE SMAPE FROM PANDAS DATAFRAME
    # Define the pandas dataframe
    sales_df = pd.DataFrame({
        "actual"    : [136, 120, 138, 155, 149],
        "predicted" : [134, 124, 132, 141, 149]
    })
  
    # Get SMAPE for pandas series as parameters
    print("pandas df:", calculate_smape(sales_df.actual, 
                                        sales_df.predicted), "%")

Output:

py list  : 3.73 %
np array : 3.73 %
pandas df: 3.73 %
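One edge case worth keeping in mind: if an actual value and its forecast are both zero, the denominator of that term is zero and plain division yields NaN. The sketch below is a hypothetical variant, `calculate_smape_safe`, that counts such terms as zero error. That convention is an assumption made here for illustration and is not the only reasonable choice.

```python
import numpy as np


def calculate_smape_safe(actual, predicted) -> float:
    """SMAPE in percent; terms with a zero denominator count as 0 error.

    Hypothetical variant: treating 0/0 as zero error is one convention
    among several.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    numerator = np.abs(predicted - actual)
    denominator = (np.abs(predicted) + np.abs(actual)) / 2

    # Divide only where the denominator is non-zero; elsewhere keep
    # the 0.0 already stored in the output array
    ratios = np.divide(numerator, denominator,
                       out=np.zeros_like(numerator),
                       where=denominator != 0)

    return round(float(np.mean(ratios)) * 100, 2)


# First pair is 0 vs 0, which would otherwise produce NaN
print(calculate_smape_safe([0, 10], [0, 12]))
```

On data without zero pairs, this variant returns the same value as the function above.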

Explanation:

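In the program, we computed the SMAPE metric for the same dataset supplied as function arguments in three different data types: a Python list, a NumPy array, and a pandas DataFrame (whose columns are passed in as Series). The function is generalized to accept any sequence-like Python data as input. It first converts the inputs to NumPy arrays so that the calculation can be carried out with NumPy methods. The return statement can be explained with the figure below: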

[Figure: SMAPE code explanation]
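The return statement can also be unpacked into named intermediate steps. The sketch below follows the same arithmetic on the example data; the variable names are chosen here for illustration.

```python
import numpy as np

actual = np.array([136, 120, 138, 155, 149])
predicted = np.array([134, 124, 132, 141, 149])

# Step 1: absolute error for each day (numerator)
abs_error = np.abs(predicted - actual)

# Step 2: mean of the absolute forecast and actual values (denominator)
mean_magnitude = (np.abs(predicted) + np.abs(actual)) / 2

# Step 3: element-wise ratio, one value per day
ratios = abs_error / mean_magnitude

# Step 4: average the ratios, convert to percent, round to 2 decimals
smape_percent = round(np.mean(ratios) * 100, 2)

print(smape_percent)  # 3.73
```

Each step maps to one layer of the nested expression in `calculate_smape`, read from the inside out.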