The ROC (Receiver Operating Characteristic) curve is a widely used evaluation tool for binary classification problems. It shows the performance of a classifier at various classification thresholds, while the AUC (Area Under the Curve) score is a single number that summarizes that performance over all possible thresholds. In this tutorial, we will discuss how to plot the ROC curve and compute the AUC score for a binary classification problem.
Before we get started, we need to import the required libraries. We will use scikit-learn to compute the ROC curve and AUC score, and matplotlib to plot the curve.
from sklearn.metrics import roc_auc_score, roc_curve
import matplotlib.pyplot as plt
Next, we need to generate the predictions and true labels for our binary classification problem. We will use the following sample data for our example.
import numpy as np

# sample data: true labels and predicted probabilities for the positive class
y_true = np.array([0, 1, 1, 0, 1, 0, 0, 1, 0, 1])
y_pred = np.array([0.1, 0.8, 0.9, 0.3, 0.6, 0.2, 0.4, 0.7, 0.2, 0.5])
In the above code, y_true holds the true labels and y_pred holds the predicted probabilities for the positive class.
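In practice, these probabilities would come from a trained model rather than being hard-coded. As a minimal sketch, assuming a fitted scikit-learn classifier named clf and a held-out feature matrix X_test (both hypothetical names, not defined in this tutorial), the positive-class probabilities could be obtained like this:
# hypothetical: clf is any fitted scikit-learn classifier, X_test the evaluation features
# predict_proba returns one column per class; column 1 is the positive class
y_pred = clf.predict_proba(X_test)[:, 1]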
Now, we can compute the ROC AUC score for our binary classification problem using the roc_auc_score function from scikit-learn.
roc_auc = roc_auc_score(y_true, y_pred)
print('ROC AUC Score:', roc_auc)
The output of the above code will be:
ROC AUC Score: 1.0
The score is a perfect 1.0 here because every positive example in the sample data was assigned a higher predicted probability than every negative example.
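This perfect separation can be checked directly against the probabilistic interpretation of the AUC: it equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, with ties counting as one half. A minimal sketch that reproduces the roc_auc_score result using only the sample data above:
pos = y_pred[y_true == 1]  # scores of the positive examples
neg = y_pred[y_true == 0]  # scores of the negative examples
# fraction of (positive, negative) pairs ranked correctly; ties count as 0.5
auc_manual = (pos[:, None] > neg[None, :]).mean() + 0.5 * (pos[:, None] == neg[None, :]).mean()
print('Pairwise AUC:', auc_manual)  # 1.0 for this data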
Finally, we can plot the ROC curve for our binary classification problem using the roc_curve function from scikit-learn and the matplotlib library.
fpr, tpr, thresholds = roc_curve(y_true, y_pred)  # one (FPR, TPR) point per threshold
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--')  # diagonal reference line for a random classifier
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend(loc="lower right")
plt.show()
The output of the above code will be a plot of the ROC curve. With this sample data, the curve hugs the top-left corner, reflecting the perfect AUC of 1.0.
In the above code, we first compute the false positive rate (fpr), true positive rate (tpr), and the corresponding thresholds using the roc_curve function. We then plot the ROC curve with the plot function, set the axis limits and labels, add a title, place the legend with the legend function, and finally display the figure with the show function.
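To make the threshold-to-point mapping concrete, we can print the three arrays returned by roc_curve side by side. A small sketch (the exact values depend on the sample data; scikit-learn also prepends an extra threshold above every score so that the curve starts at (0, 0)):
for f, t, th in zip(fpr, tpr, thresholds):
    print('threshold=%.2f  FPR=%.2f  TPR=%.2f' % (th, f, t))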
In this tutorial, we have discussed how to plot the ROC curve and compute the AUC score for a binary classification problem. The ROC curve provides a graphical view of a classifier's performance at various classification thresholds, while the AUC score summarizes that performance over all possible thresholds. Understanding both helps us evaluate our classifiers more rigorously and make better-informed decisions.