
Bayesian Optimization of XGBoost Hyperparameters with scikit-optimize

Bayesian optimization is an efficient alternative to grid search for finding optimal hyperparameters in XGBoost.

Unlike grid search, which exhaustively evaluates all combinations of hyperparameters, Bayesian optimization intelligently selects the next set of hyperparameters to evaluate based on the results of previous evaluations. This can lead to finding better hyperparameters in fewer iterations.

Scikit-optimize provides an easy-to-use implementation of Bayesian optimization that integrates seamlessly with scikit-learn’s API.

First, you must install the scikit-optimize library.

For example:

pip install scikit-optimize
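
To confirm the installation succeeded, you can print the installed version as a quick sanity check (assuming a standard pip environment):

python -c "import skopt; print(skopt.__version__)"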

We can then use Bayesian optimization to search XGBoost hyperparameters with the BayesSearchCV class.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from skopt import BayesSearchCV
from skopt.space import Real, Integer

# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define search space
search_space = {
    'max_depth': Integer(3, 10),
    'learning_rate': Real(0.01, 0.3, prior='log-uniform'),
    'subsample': Real(0.5, 1.0),
    'colsample_bytree': Real(0.5, 1.0)
}

# Create XGBoost classifier
xgb = XGBClassifier(n_estimators=100, objective='binary:logistic', random_state=42)

# Perform Bayesian optimization
bayes_search = BayesSearchCV(estimator=xgb, search_spaces=search_space, n_iter=25, cv=3, n_jobs=-1, verbose=2)
bayes_search.fit(X_train, y_train)

# Print best parameters
print(f"Best parameters: {bayes_search.best_params_}")
print(f"Best score: {bayes_search.best_score_}")

In this example:

  1. We load the breast cancer dataset, split it into train and test sets, and define the search space for the hyperparameters we want to optimize. The Integer and Real classes from scikit-optimize are used to specify the ranges and types of the hyperparameters.

  2. We create an instance of the XGBoost classifier with basic parameters.

  3. We create a BayesSearchCV object, specifying the classifier, search space, number of iterations (n_iter), cross-validation splits (cv), and number of jobs (n_jobs).

  4. We fit the BayesSearchCV object to the training data. During fitting, it will use Bayesian optimization to select the next set of hyperparameters to evaluate based on the results of previous evaluations.

  5. After fitting, we print the best hyperparameters and the corresponding best score. Because BayesSearchCV refits the best model by default, it is also available as best_estimator_, which we evaluate on the held-out test set in the sketch after this list.
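
As a follow-up, the refit best model can be evaluated on the held-out test set that the code above never touches. This is a minimal sketch reusing the variables defined earlier; it assumes the default refit=True so that best_estimator_ is available:

# Evaluate the best model found by Bayesian optimization on the held-out test set
best_model = bayes_search.best_estimator_
test_accuracy = best_model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.4f}")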

Bayesian optimization can often find better hyperparameters than grid search in fewer iterations, especially for high-dimensional search spaces. However, it’s important to note that the efficiency of Bayesian optimization depends on the specific problem and dataset. In some cases, grid search may still be preferred. Nonetheless, Bayesian optimization is a powerful tool to have in your hyperparameter tuning toolkit, and scikit-optimize makes it easy to apply to XGBoost and other scikit-learn compatible estimators.
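
For higher-dimensional searches, the same pattern scales by adding dimensions to the search space, including Categorical dimensions. The ranges below are illustrative assumptions rather than recommended values; the sketch reuses the xgb estimator and training data from the example above and typically needs more iterations as the number of dimensions grows:

from skopt.space import Real, Integer, Categorical

# A larger, illustrative search space (ranges are assumptions, not recommendations)
wider_search_space = {
    'n_estimators': Integer(50, 500),
    'max_depth': Integer(3, 10),
    'learning_rate': Real(0.01, 0.3, prior='log-uniform'),
    'subsample': Real(0.5, 1.0),
    'colsample_bytree': Real(0.5, 1.0),
    'booster': Categorical(['gbtree', 'dart'])
}

# Allow more iterations for the larger space
wider_search = BayesSearchCV(estimator=xgb, search_spaces=wider_search_space, n_iter=50, cv=3, n_jobs=-1, random_state=42)
wider_search.fit(X_train, y_train)
print(f"Best parameters: {wider_search.best_params_}")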


