XGBoost Confidence Interval using Jackknife Resampling

Jackknife resampling provides an alternative to the bootstrap for estimating confidence intervals of XGBoost model performance metrics, particularly for small datasets where refitting the model once per observation is affordable.

Unlike the bootstrap, which fits the model on many randomly resampled datasets, the Jackknife method refits the model exactly once per observation in the original dataset, leaving that observation out each time. For n observations this means n model fits, which can be computationally prohibitive for large datasets but can be cheaper than the bootstrap when n is smaller than the number of bootstrap resamples.
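To make the cost comparison concrete, the following minimal sketch builds the index sets each method would train on and counts the resulting model fits. The values of `n` and `n_boot` are illustrative assumptions and are not taken from the example that follows.

```python
import random

n = 8          # number of observations (illustrative assumption)
n_boot = 1000  # number of bootstrap resamples (illustrative assumption)
rng = random.Random(42)
indices = list(range(n))

# Jackknife: exactly n deterministic subsets, each omitting one observation
jackknife_sets = [[j for j in indices if j != i] for i in indices]

# Bootstrap: n_boot random resamples of size n, drawn with replacement
bootstrap_sets = [rng.choices(indices, k=n) for _ in range(n_boot)]

print(f"Jackknife model fits: {len(jackknife_sets)}")
print(f"Bootstrap model fits: {len(bootstrap_sets)}")
```

With these numbers the Jackknife needs far fewer fits, but the balance reverses once the dataset has more rows than the chosen number of bootstrap resamples.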

This example demonstrates how to use Jackknife resampling to estimate a 95% confidence interval for the error of an XGBoost model trained on a synthetic regression dataset.

# XGBoosting.com
# Evaluate XGBoost Model Performance Confidence Intervals using Jackknife Resampling
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor
import numpy as np

# Generate a synthetic regression dataset
X, y = make_regression(n_samples=100, n_features=10, n_informative=5, noise=0.1, random_state=42)

# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define a function to compute Jackknife replicates of an error metric
def jackknife(model, X, y):
    n = len(X)
    scores = []
    for i in range(n):
        X_jack = np.delete(X, i, axis=0)
        y_jack = np.delete(y, i)
        model.fit(X_jack, y_jack)
        y_pred = model.predict(X[[i]])
        abs_error = abs(y[i] - y_pred[0])
        scores.append(abs_error)
    return np.array(scores)

# Instantiate an XGBRegressor with default hyperparameters
model = XGBRegressor(random_state=42)

# Compute leave-one-out absolute errors and take their 2.5th/97.5th percentiles as interval bounds
error_scores = jackknife(model, X_train, y_train)
ci_low, ci_high = np.percentile(error_scores, [2.5, 97.5])

print(f"Mean Absolute Error: {error_scores.mean():.3f}")
print(f"95% CI: [{ci_low:.3f}, {ci_high:.3f}]")

The code first generates a synthetic regression dataset using scikit-learn’s make_regression function and splits the data into train and test sets. Only the training split is used by the resampling that follows.

Next, we define a jackknife function that takes a model and training data as input. For each observation in the training data, this function leaves out that observation, refits the model on the remaining data, and computes the absolute error on the left-out observation. The function returns an array of Jackknife errors.

An XGBRegressor is instantiated with default hyperparameters (and a fixed random_state), and the jackknife function is called with the model and training data to compute the Jackknife replicates of absolute error.

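Finally, the 2.5th and 97.5th percentiles of the Jackknife absolute error replicates are computed and reported as the 95% interval bounds. Because each replicate is the error on a single left-out observation, these percentiles describe the spread of per-observation errors rather than the uncertainty in the mean itself. The mean absolute error and the interval are printed.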
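As a complement to the percentile bounds, one common way to quantify uncertainty in the mean absolute error itself is the Jackknife standard error of the mean. The sketch below is a minimal illustration under stated assumptions: the helper name `jackknife_mean_ci`, the sample values in `example_errors`, and the normal-approximation multiplier `z=1.96` are hypothetical and do not appear in the example above. Leave-one-out errors come from models trained on heavily overlapping data, so the resulting interval should be treated as approximate.

```python
import numpy as np

def jackknife_mean_ci(values, z=1.96):
    """Return (mean, lower, upper) using the jackknife standard error of the mean.

    Assumes a normal approximation; z=1.96 corresponds to roughly 95% coverage.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Leave-one-out replicates: the mean of all values except the i-th
    replicates = (values.sum() - values) / (n - 1)
    # Jackknife standard error of the mean
    se = np.sqrt((n - 1) / n * np.sum((replicates - replicates.mean()) ** 2))
    estimate = values.mean()
    return estimate, estimate - z * se, estimate + z * se

# Hypothetical per-observation absolute errors, standing in for error_scores
example_errors = np.array([0.8, 1.2, 0.5, 2.0, 1.1, 0.9, 1.6, 0.7])
mae, low, high = jackknife_mean_ci(example_errors)
print(f"MAE: {mae:.3f}, 95% CI: [{low:.3f}, {high:.3f}]")
```

Passing the `error_scores` array from the example to this helper would give an interval centered on the mean absolute error, which is typically much narrower than the percentile range of individual errors.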

See Also