XGBoost Evaluate Model using the Bootstrap Method


The bootstrap method is a resampling technique that estimates how well your XGBoost model performs on unseen data and how much that estimate varies.

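By repeatedly sampling your data with replacement, you create many training sets of the same size as the original. Each model is trained on one of these samples and evaluated on the rows that sample left out, which gives a more reliable assessment than a single train/test split.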

This example demonstrates how to use the bootstrap method to evaluate an XGBoost classifier on a synthetic dataset.

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier
import numpy as np

# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)

# Define the number of bootstrap iterations
n_iterations = 100

# Initialize an array to store the out-of-bag scores
oob_scores = np.zeros(n_iterations)

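# Perform the bootstrap (seeded generator so the resampling is reproducible)
rng = np.random.default_rng(42)
for i in range(n_iterations):
    # Create a bootstrap sample by drawing row indices with replacement
    indices = rng.choice(len(X), size=len(X), replace=True)
    X_boot, y_boot = X[indices], y[indices]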

    # Create an XGBClassifier
    model = XGBClassifier(eval_metric='error')

    # Train the model on the bootstrap sample
    model.fit(X_boot, y_boot)

    # Identify the out-of-bag samples (rows never drawn into the bootstrap sample)
    oob_indices = np.setdiff1d(np.arange(len(X)), indices)
    X_oob, y_oob = X[oob_indices], y[oob_indices]

    # Evaluate the model on the out-of-bag samples
    y_pred_oob = model.predict(X_oob)
    oob_scores[i] = accuracy_score(y_oob, y_pred_oob)

# Print the mean and standard deviation of the out-of-bag scores
print(f"Bootstrap Accuracy: {oob_scores.mean():.2f} (+/- {oob_scores.std():.2f})")

Here’s what’s happening:

We generate a synthetic binary classification dataset using scikit-learn’s make_classification function.
We define the number of bootstrap iterations we want to perform (n_iterations).
We initialize an array to store the out-of-bag accuracy scores for each iteration.
We start the bootstrap loop:
- We create a bootstrap sample by randomly sampling the original dataset with replacement.
- We create an XGBClassifier and train it on the bootstrap sample.
- We identify the out-of-bag samples (those not included in the bootstrap sample).
- We evaluate the model on the out-of-bag samples and record the accuracy score.
After the loop, we calculate and print the mean and standard deviation of the out-of-bag scores.
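
To get a feel for how many rows end up out-of-bag in each iteration, the following minimal sketch draws a single bootstrap sample and measures the fraction of rows it leaves out. The helper name oob_fraction is hypothetical and exists only for this illustration; it uses the standard library rather than NumPy so it runs on its own.

```python
import random

def oob_fraction(n_samples, seed=0):
    """Fraction of rows left out of one bootstrap sample of size n_samples.

    Hypothetical helper for illustration; not part of XGBoost or scikit-learn.
    """
    rng = random.Random(seed)
    # Draw n_samples row indices with replacement
    drawn = rng.choices(range(n_samples), k=n_samples)
    # Rows that were never drawn are the out-of-bag rows
    return 1 - len(set(drawn)) / n_samples

frac = oob_fraction(1000)
print(f"Out-of-bag fraction: {frac:.3f}")
```

For large datasets this fraction approaches 1/e, roughly 0.368, so each model in the loop is evaluated on a little over a third of the rows and trained on the remaining distinct rows (some of them repeated).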

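The bootstrap method provides a way to assess your model’s performance without setting aside a separate validation set. Because each model is scored only on out-of-bag samples it never saw during training, the estimate is not inflated by overfitting to the training data. It does tend to be slightly pessimistic, since each bootstrap sample contains only about 63% of the distinct rows, so every model trains on less unique data than a model fit to the full dataset would. The standard deviation of the out-of-bag scores gives you a sense of how much the performance varies from one resample to the next.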
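
Beyond the mean and standard deviation, one common way to summarize the spread of the scores is an empirical percentile interval. The sketch below is one simple way to compute it, assuming the scores are available as any sequence of floats (the oob_scores array from the example would work). The function name percentile_interval and the example_scores values are hypothetical placeholders, not results from a real model.

```python
import math

def percentile_interval(scores, confidence=0.95):
    """Empirical percentile interval of bootstrap scores.

    Hypothetical helper for illustration. Positions are rounded outward,
    so the interval errs on the wide side for small numbers of scores.
    """
    ordered = sorted(scores)
    n = len(ordered)
    tail = (1.0 - confidence) / 2.0
    # Index of the lower percentile, rounded down
    lower = ordered[math.floor(tail * (n - 1))]
    # Index of the upper percentile, rounded up
    upper = ordered[math.ceil((1.0 - tail) * (n - 1))]
    return lower, upper

# Placeholder scores for illustration only (not output from a trained model)
example_scores = [0.88, 0.90, 0.91, 0.89, 0.92, 0.87, 0.90, 0.93, 0.89, 0.91]
low, high = percentile_interval(example_scores)
print(f"95% percentile interval: [{low:.2f}, {high:.2f}]")
```

With only a handful of scores the interval simply spans the minimum and maximum; it becomes more informative as the number of bootstrap iterations grows.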
