
Configure XGBoost Early Stopping Via Callback

Early stopping is a regularization technique that helps prevent overfitting in XGBoost models by halting the training process when the model’s performance on a validation set stops improving.

By using the xgboost.callback.EarlyStopping callback, you can easily configure early stopping behavior and control the conditions under which training is stopped.

Here’s an example that demonstrates how to use the EarlyStopping callback to configure early stopping in XGBoost:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
import xgboost as xgb

# Load the breast cancer dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Split the data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the early stopping callback
early_stop = xgb.callback.EarlyStopping(rounds=10, metric_name='error')

# Set up the XGBoost model with the early stopping callback
# (passing callbacks to the constructor requires xgboost >= 1.6)
model = xgb.XGBClassifier(
    objective='binary:logistic',
    eval_metric='error',
    callbacks=[early_stop]
)

# Train the model, evaluating on the validation set after each boosting round
model.fit(X_train, y_train, eval_set=[(X_val, y_val)])

# Print the best iteration and validation score
print(f"Best iteration: {model.best_iteration}")
print(f"Best validation score: {model.best_score}")

The EarlyStopping callback takes several key parameters, such as:

- rounds: the number of consecutive rounds without improvement after which training stops.
- metric_name: the name of the metric to monitor (by default, the last metric supplied via eval_metric).
- data_name: the name of the evaluation set to monitor (by default, the last dataset in eval_set).
- maximize: whether the monitored metric should be maximized rather than minimized.
- save_best: whether the returned model is truncated at the best iteration instead of the last one.
- min_delta: the minimum absolute improvement in the metric required to count as progress.
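For example, a more fully specified callback might look like the sketch below; the data_name value 'validation_0' assumes the default name XGBoost's scikit-learn interface assigns to the first eval_set entry, and the numeric values are purely illustrative:

# A stricter early stopping configuration (illustrative values)
early_stop = xgb.callback.EarlyStopping(
    rounds=15,                 # tolerate 15 rounds without improvement
    metric_name='error',       # monitor the classification error metric
    data_name='validation_0',  # default name of the first eval_set entry
    maximize=False,            # error should be minimized
    save_best=True,            # return the model from the best iteration
    min_delta=1e-4             # ignore improvements smaller than this
)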

Tuning these parameters is problem-dependent and may require some experimentation.

It’s crucial to monitor the model’s performance on the validation set and adjust the early stopping settings accordingly. If the model is stopping too early, you may want to increase rounds or decrease min_delta. Conversely, if the model is overfitting, decreasing rounds or increasing min_delta can help.
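To see where training actually stopped and how the metric evolved, you can inspect the recorded evaluation history via evals_result(), again assuming the default 'validation_0' naming for the first eval_set entry:

# Per-round validation error recorded during training
history = model.evals_result()['validation_0']['error']

print(f"Rounds trained: {len(history)}")
print(f"Error at best iteration ({model.best_iteration}): {history[model.best_iteration]:.4f}")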

By leveraging the EarlyStopping callback and carefully tuning its parameters, you can effectively regularize your XGBoost models and find the optimal balance between model complexity and generalization performance.


