Random search is an alternative to grid search for finding optimal XGBoost hyperparameters.

Instead of exhaustively searching through a predefined grid, random search samples hyperparameter values randomly from a specified distribution.

This can be more efficient, especially when dealing with large hyperparameter spaces.

Here’s how to perform random search for XGBoost using scikit-learn:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from xgboost import XGBClassifier
from scipy.stats import uniform
# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target
# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define parameter distributions
param_dist = {
    'max_depth': [3, 5, 7, 9, 11],
    'min_child_weight': [1, 3, 5, 7],
    'subsample': uniform(0.6, 0.4),
    'colsample_bytree': uniform(0.6, 0.4),
    'learning_rate': uniform(0.01, 0.29)
}
# Create XGBoost classifier
xgb = XGBClassifier(n_estimators=100, objective='binary:logistic', random_state=42)
# Perform random search
random_search = RandomizedSearchCV(estimator=xgb, param_distributions=param_dist, n_iter=50, cv=3, n_jobs=-1, verbose=2, random_state=42)
random_search.fit(X_train, y_train)
# Print best parameters
print(f"Best parameters: {random_search.best_params_}")
print(f"Best score: {random_search.best_score_}")
```

In this example:

- We load the breast cancer dataset and split it into train and test sets.
- We define a parameter distribution `param_dist`. For `max_depth` and `min_child_weight`, we provide a list of discrete values to sample from. For `subsample`, `colsample_bytree`, and `learning_rate`, we use `scipy.stats.uniform` to define a continuous distribution to sample from. The first argument is the lower bound and the second is the range (upper bound minus lower bound).
- We create an XGBoost classifier `xgb`.
- We create a `RandomizedSearchCV` object `random_search`, specifying the classifier, the parameter distributions, the number of iterations (`n_iter`), and the number of cross-validation splits (`cv`). Setting `random_state` ensures reproducibility.
- We fit `random_search` to the training data. This randomly samples hyperparameter combinations from `param_dist` and evaluates the model for each one.
- We print the best parameters and the corresponding best score.
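The `uniform(loc, scale)` parameterization is a common source of confusion, so here is a quick standalone check (pure SciPy, independent of the search above) confirming that `uniform(0.6, 0.4)` samples from the interval [0.6, 1.0]:

```python
from scipy.stats import uniform

# scipy.stats.uniform(loc, scale) is uniform on [loc, loc + scale],
# so uniform(0.6, 0.4) covers [0.6, 1.0].
dist = uniform(0.6, 0.4)
samples = dist.rvs(size=1000, random_state=42)
print(samples.min(), samples.max())  # both values lie within [0.6, 1.0]
```

The same pattern explains `uniform(0.01, 0.29)` for `learning_rate`: it draws values between 0.01 and 0.30.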

Random search can be a good choice when you have a large hyperparameter space and limited computational resources. It lets you explore a wide range of values without exhaustively evaluating every combination. The `n_iter` argument controls how many random configurations are tried.

As with grid search, you can use the best parameters found by random search to train your final model on the full training set and evaluate its performance on the test set.