
Halving Grid Search for XGBoost Hyperparameters

Halving grid search is a more efficient alternative to standard grid search for finding optimal XGBoost hyperparameters.

It uses a successive halving strategy to eliminate less promising hyperparameter configurations early, reducing computational cost.

Scikit-learn’s HalvingGridSearchCV class makes it easy to implement halving grid search with XGBoost.

At the time of writing, halving grid search is an experimental feature; enabling it requires an additional import, which must come before importing HalvingGridSearchCV:

from sklearn.experimental import enable_halving_search_cv

Here is a complete example of halving grid search with XGBoost:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.experimental import enable_halving_search_cv
from sklearn.model_selection import HalvingGridSearchCV
from xgboost import XGBClassifier

# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define parameter grid
param_grid = {
    'max_depth': [3, 5, 7],
    'min_child_weight': [1, 3, 5],
    'subsample': [0.6, 0.8, 1.0],
    'colsample_bytree': [0.6, 0.8, 1.0],
    'learning_rate': [0.01, 0.1, 0.3]
}

# Create XGBoost classifier (its n_estimators setting is overridden
# by the search, since n_estimators is used as the halving resource)
xgb = XGBClassifier(n_estimators=100, objective='binary:logistic', random_state=42)

# Perform halving grid search, growing n_estimators for the
# surviving configurations at each iteration
halving_search = HalvingGridSearchCV(
    estimator=xgb,
    param_grid=param_grid,
    cv=3,
    factor=3,
    resource='n_estimators',
    max_resources=100,
    verbose=2
)
halving_search.fit(X_train, y_train)

# Print best parameters
print(f"Best parameters: {halving_search.best_params_}")
print(f"Best score: {halving_search.best_score_}")

The key differences from the standard grid search code snippet are:

  1. The extra enable_halving_search_cv import, which activates the experimental feature.
  2. HalvingGridSearchCV in place of GridSearchCV.
  3. The factor, resource, and max_resources arguments, which control the successive halving schedule.

Here’s how halving search works:

  1. It starts with a small amount of resources (e.g., a small number of n_estimators) for each hyperparameter configuration.
  2. It evaluates all configurations and eliminates the least promising ones based on the factor parameter.
  3. It increases the resources for the remaining configurations and repeats the process until max_resources is reached (a worked sketch of this schedule follows below).
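
To make the arithmetic concrete, here is a minimal sketch of the schedule, assuming all 243 configurations from the grid above and a hypothetical starting budget of 4 estimators per configuration (the actual starting budget is chosen by HalvingGridSearchCV from its min_resources setting, so real runs may differ):

factor = 3
n_candidates = 3 ** 5   # 243 configurations in the param_grid above
n_resources = 4         # assumed starting n_estimators per configuration
max_resources = 100

iteration = 0
while n_candidates > 1 and n_resources <= max_resources:
    print(f"Iteration {iteration}: {n_candidates} candidates, "
          f"{n_resources} estimators each")
    n_candidates = max(1, n_candidates // factor)  # keep the top 1/factor
    n_resources *= factor                          # grow the budget
    iteration += 1

With these assumptions, the budget of 100 estimators allows three rounds: 243 configurations at 4 estimators each, then 81 at 12, then 27 at 36. Most of the grid is discarded after only a few cheap fits, which is where the savings come from.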

This process can significantly reduce computational cost compared to standard grid search, especially over a large hyperparameter space. However, it risks eliminating, too early, a configuration that would have performed well with more resources. Careful selection of the factor and max_resources parameters helps mitigate this risk.
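
One way to sanity-check those choices is to inspect the schedule the fitted search actually ran, which HalvingGridSearchCV exposes via the n_iterations_, n_candidates_, and n_resources_ attributes:

# Number of successive halving iterations that were run
print(f"Iterations: {halving_search.n_iterations_}")
# Configurations evaluated at each iteration
print(f"Candidates per iteration: {halving_search.n_candidates_}")
# n_estimators allotted per configuration at each iteration
print(f"Resources per iteration: {halving_search.n_resources_}")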

Halving grid search is a powerful tool for efficiently tuning XGBoost hyperparameters. It can save significant computational resources while still finding high-performing hyperparameter configurations. Give it a try in your next XGBoost project!


