When dealing with binary classification tasks, the Area Under the ROC Curve (AUC) is a widely used evaluation metric. AUC measures the model’s ability to discriminate between classes, with a higher value indicating better performance.
By setting eval_metric='auc', you can track your model’s AUC during training and leverage early stopping to avoid overfitting.
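Intuitively, AUC is the probability that the model ranks a randomly chosen positive example above a randomly chosen negative one. Here is a minimal sketch of computing it directly with scikit-learn’s roc_auc_score on a handful of hand-made labels and scores (the values are illustrative only):
from sklearn.metrics import roc_auc_score
# True labels and predicted scores for six examples (illustrative values)
y_true = [0, 0, 1, 1, 1, 0]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
# An AUC of 1.0 means perfect ranking; 0.5 is no better than random guessing
print(f"AUC: {roc_auc_score(y_true, y_scores):.3f}")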
Here’s an example of how to use AUC as the evaluation metric with XGBoost and scikit-learn:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
import matplotlib.pyplot as plt
# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create an XGBClassifier with AUC as the evaluation metric
model = XGBClassifier(n_estimators=100, eval_metric='auc', early_stopping_rounds=10, random_state=42)
# Train the model with early stopping
model.fit(X_train, y_train, eval_set=[(X_test, y_test)])
# Retrieve the AUC values from the training process
results = model.evals_result()
epochs = len(results['validation_0']['auc'])
x_axis = range(0, epochs)
# Plot the AUC values
plt.figure()
plt.plot(x_axis, results['validation_0']['auc'], label='Test')
plt.legend()
plt.xlabel('Number of Boosting Rounds')
plt.ylabel('AUC')
plt.title('XGBoost AUC Performance')
plt.show()
In this example, we generate a synthetic binary classification dataset using scikit-learn’s make_classification function and split it into training and testing sets.
We create an instance of XGBClassifier and set eval_metric='auc' to specify AUC as the evaluation metric. We also set early_stopping_rounds=10 to enable early stopping if the AUC doesn’t improve for 10 consecutive rounds.
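When early stopping triggers, the fitted model records which round achieved the best validation AUC. A short sketch of reading it back, reusing the model object trained above (these attributes are available in recent XGBoost versions once early stopping is configured):
# Index of the boosting round with the best validation AUC
print(f"Best iteration: {model.best_iteration}")
# Best AUC value observed on the evaluation set
print(f"Best AUC: {model.best_score:.4f}")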
During training, we pass the testing set as the eval_set to monitor the model’s performance on unseen data. After training, we retrieve the recorded AUC values with the evals_result() method.
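The dictionary returned by evals_result() is keyed first by evaluation set name (validation_0 for the first entry in eval_set) and then by metric name, with one value recorded per boosting round. A quick sketch of inspecting it, reusing the results variable from the example:
# One AUC value is recorded per boosting round
auc_history = results['validation_0']['auc']
print(f"Rounds evaluated: {len(auc_history)}")
print(f"Final AUC: {auc_history[-1]:.4f}")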
Finally, we plot the AUC values against the number of boosting rounds to visualize the model’s performance during training. This plot helps us assess whether the model is overfitting or underfitting and determine the optimal number of boosting rounds.
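To make the overfitting check explicit, you can pass both the training and testing sets in eval_set and plot the two AUC curves together; a widening gap between them is a classic sign of overfitting. A sketch of this variation, reusing the data splits from the example (note that XGBoost applies early stopping to the last set in eval_set):
# Track AUC on both the training and testing sets
model = XGBClassifier(n_estimators=100, eval_metric='auc', early_stopping_rounds=10, random_state=42)
model.fit(X_train, y_train, eval_set=[(X_train, y_train), (X_test, y_test)], verbose=False)
results = model.evals_result()
# validation_0 holds the training-set curve, validation_1 the testing-set curve
plt.figure()
plt.plot(results['validation_0']['auc'], label='Train')
plt.plot(results['validation_1']['auc'], label='Test')
plt.legend()
plt.xlabel('Number of Boosting Rounds')
plt.ylabel('AUC')
plt.title('Train vs. Test AUC')
plt.show()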
Using AUC as the evaluation metric allows us to effectively monitor the model’s discrimination ability, prevent overfitting through early stopping, and select the best model based on the highest AUC value.
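As a final sanity check, the AUC logged during training should match the AUC computed directly on the held-out predictions. A minimal sketch, reusing the fitted model and test split from the example:
from sklearn.metrics import roc_auc_score
# Predicted probability of the positive class for each test example
y_proba = model.predict_proba(X_test)[:, 1]
print(f"Test AUC: {roc_auc_score(y_test, y_proba):.4f}")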