
Manually Search XGBoost Hyperparameters with For Loops

Automated hyperparameter tuning methods such as grid search and random search are popular, but tuning XGBoost hyperparameters manually with for loops is still a valuable approach, especially when you want to learn how each hyperparameter affects model performance or when you are working with smaller datasets.

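The following Python snippet tunes three XGBoost hyperparameters (max_depth, learning_rate, and subsample) with nested for loops, keeping the combination that gives the lowest mean squared error on a held-out validation set: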

import xgboost as xgb
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load the dataset
dataset = fetch_california_housing()
X, y = dataset.data, dataset.target

# Split the data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the hyperparameter search space
max_depth_list = [3, 5, 7]
learning_rate_list = [0.01, 0.1, 0.3]
subsample_list = [0.5, 0.8, 1.0]

# Initialize variables to store the best hyperparameters and score
best_score = float("inf")
best_params = None

# Iterate over all combinations of hyperparameters
for max_depth in max_depth_list:
    for learning_rate in learning_rate_list:
        for subsample in subsample_list:
            # Define the XGBoost model with the current hyperparameters
            model = xgb.XGBRegressor(n_estimators=100, max_depth=max_depth,
                                     learning_rate=learning_rate, subsample=subsample)

            # Train the model
            model.fit(X_train, y_train)

            # Evaluate the model on the validation set
            y_pred = model.predict(X_val)
            score = mean_squared_error(y_val, y_pred)

            # Update the best hyperparameters and score if necessary
            if score < best_score:
                best_score = score
                best_params = {"max_depth": max_depth, "learning_rate": learning_rate, "subsample": subsample}

print(f"Best hyperparameters: {best_params}")
print(f"Best validation MSE: {best_score:.3f}")
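The loop above keeps only the best combination. Since one stated goal of manual tuning is to see how each hyperparameter affects performance, it may help to record every result as well. The following is a minimal sketch and not part of the original snippet: `results` and `mean_score_by_value` are hypothetical names, and it assumes each record is a dict holding the hyperparameters plus a `score` key, appended inside the innermost loop.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_value(results, name):
    """Average the validation score over all runs sharing each value of `name`.

    `results` is assumed to be a list of dicts such as
    {"max_depth": 3, "learning_rate": 0.1, "subsample": 0.8, "score": ...},
    one per combination tried in the loop (a hypothetical structure).
    """
    grouped = defaultdict(list)
    for record in results:
        grouped[record[name]].append(record["score"])
    # Sort by hyperparameter value so any trend is easy to read
    return {value: mean(scores) for value, scores in sorted(grouped.items())}


# Inside the innermost loop, after computing `score`, one could add:
#     results.append({"max_depth": max_depth, "learning_rate": learning_rate,
#                     "subsample": subsample, "score": score})
# and, once the loops finish:
#     for name in ("max_depth", "learning_rate", "subsample"):
#         print(name, mean_score_by_value(results, name))
```

Averaging over the other hyperparameters hides interactions between them, so this view is a rough guide rather than a substitute for inspecting the full list of results.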

Why Manual Tuning with For Loops is Useful

Manual tuning with for loops offers several advantages:

Moreover, manual tuning can be a valuable learning exercise for those new to XGBoost and hyperparameter optimization, as it allows for a hands-on understanding of the tuning process.

When to Use Manual Tuning with For Loops

Manual tuning with for loops is particularly useful in the following scenarios:

However, for larger datasets or more complex search spaces, automated tuning methods such as random search or Bayesian optimization are often more efficient, because the number of combinations visited by nested loops grows multiplicatively with each added hyperparameter.
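To make the contrast with random search concrete, here is a minimal sketch that evaluates only a random subset of the same grid instead of every combination. It is a simple stand-in using the standard library, not any particular library's random search implementation; `sample_combinations` and `space` are hypothetical names, and the search space is copied from the example earlier in the article.

```python
import itertools
import random


def sample_combinations(space, n_trials, seed=0):
    """Draw up to `n_trials` distinct hyperparameter combinations at random.

    `space` maps each hyperparameter name to its list of candidate values.
    """
    names = list(space)
    all_combos = list(itertools.product(*(space[n] for n in names)))
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    chosen = rng.sample(all_combos, k=min(n_trials, len(all_combos)))
    return [dict(zip(names, combo)) for combo in chosen]


space = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "subsample": [0.5, 0.8, 1.0],
}

# Visit 10 of the 27 combinations instead of all of them
for params in sample_combinations(space, n_trials=10):
    # In practice: fit xgb.XGBRegressor(n_estimators=100, **params) and score it
    print(params)
```

This sketch enumerates the full grid before sampling, which is fine for a few dozen combinations but would itself become expensive for very large spaces.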

Limitations and Considerations

While manual tuning with for loops has its advantages, it also has some limitations:

When deciding between manual and automated tuning methods, consider the trade-offs based on your specific problem, dataset size, and available computational resources.
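One concrete way to weigh the computational trade-off is to count how many model fits the nested loops will perform before running them. The sketch below assumes the search space from the earlier example stored as a dict; `grid_size` is a hypothetical helper, and the `colsample_bytree` values are illustrative choices rather than recommendations.

```python
import math


def grid_size(space):
    """Number of model fits an exhaustive nested-loop search performs."""
    return math.prod(len(values) for values in space.values())


space = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "subsample": [0.5, 0.8, 1.0],
}
print(grid_size(space))  # 3 * 3 * 3 = 27 fits

# Adding one more hyperparameter with three candidate values triples the cost
space["colsample_bytree"] = [0.5, 0.8, 1.0]  # illustrative values
print(grid_size(space))  # 3 * 3 * 3 * 3 = 81 fits
```

Multiplying the count by the time one fit takes on your data gives a rough estimate of the total search time.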

Tips for Effective Manual Tuning

To get the most out of manual tuning with for loops, consider the following tips:

Remember, manual tuning is an iterative process. Experiment with different hyperparameter ranges and combinations, and refine your search space based on the results. With practice and experience, you’ll develop an intuition for effective hyperparameter tuning.
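Refining the search space based on results could look like the following coarse-to-fine sketch: after a first pass, build a narrower candidate list around the best value found and rerun the loops. `refine_around` is a hypothetical helper, and taking midpoints between neighbouring candidates is only one of several reasonable refinement rules.

```python
def refine_around(values, best):
    """Return a narrower candidate list centred on `best`.

    New candidates are the midpoints between `best` and its immediate
    neighbours in the sorted original list. Integer-valued lists stay integer.
    """
    ordered = sorted(values)
    i = ordered.index(best)
    all_ints = all(isinstance(v, int) for v in ordered)

    def midpoint(a, b):
        # Floor division keeps integer hyperparameters such as max_depth integral
        return (a + b) // 2 if all_ints else (a + b) / 2

    refined = [best]
    if i > 0:
        refined.append(midpoint(ordered[i - 1], best))
    if i < len(ordered) - 1:
        refined.append(midpoint(best, ordered[i + 1]))
    return sorted(set(refined))


print(refine_around([3, 5, 7], best=5))           # [4, 5, 6]
print(refine_around([0.01, 0.1, 0.3], best=0.1))  # roughly [0.055, 0.1, 0.2]
```

If the best value sits at the edge of the original list, it is usually worth extending the range in that direction before narrowing it, since the optimum may lie outside the values tried so far.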


