
XGBoost Configure "n_jobs" for Grid Search

When training an XGBoost model and performing hyperparameter tuning with grid search, the n_jobs parameter can be used to parallelize the workload across multiple CPU cores.

However, finding the optimal configuration for n_jobs can be tricky, because the available CPU cores must be divided between model training and the grid search itself.

This example demonstrates how to configure n_jobs for both tasks and compares the execution time of different configurations.
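
Before diving into the full benchmark, here is a minimal sketch of the two knobs involved (the parameter values are illustrative, not recommendations). Each grid search worker trains its own model, so the two n_jobs settings multiply:

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# The two settings multiply: the effective number of busy cores is
# roughly model n_jobs * grid search n_jobs.
model = XGBClassifier(n_jobs=2)  # threads used within each model fit
grid_search = GridSearchCV(model, {'max_depth': [3, 5]}, n_jobs=4)  # parallel fits
# On an 8-core machine, 2 * 4 = 8 keeps every core busy without oversubscription.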

# XGBoosting.com
# XGBoost Configure n_jobs for Grid Search
import os
# Limit OpenMP to 1 thread; set this before xgboost is imported so it takes effect
os.environ["OMP_NUM_THREADS"] = "1"
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from xgboost import XGBClassifier
import time
import multiprocessing

# Get the number of available CPU cores
n_cores = multiprocessing.cpu_count()

# Generate a synthetic dataset
X, y = make_classification(n_samples=10000, n_features=20, random_state=42)

# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the parameter grid for grid search
param_grid = {
    'max_depth': [3, 5, 7],
    'learning_rate': [0.1, 0.01, 0.001],
    'subsample': [0.8, 1.0]
}

# Function to train the model and perform grid search
def train_and_tune(model_n_jobs, grid_search_n_jobs):
    model = XGBClassifier(n_estimators=100, n_jobs=model_n_jobs, random_state=42)
    grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, n_jobs=grid_search_n_jobs)

    start_time = time.perf_counter()
    grid_search.fit(X_train, y_train)
    end_time = time.perf_counter()

    return end_time - start_time

# Compare different n_jobs configurations
configurations = [
    (1, 1),
    (n_cores, n_cores),
    (n_cores//2, n_cores//2),
    (1, n_cores),
    (1, n_cores//2),
    (n_cores, 1),
    (n_cores//2, 1)
]

for model_n_jobs, grid_search_n_jobs in configurations:
    execution_time = train_and_tune(model_n_jobs, grid_search_n_jobs)
    print(f"Model n_jobs: {model_n_jobs}, Grid Search n_jobs: {grid_search_n_jobs}, Execution Time: {execution_time:.2f} seconds")

You may see results similar to the following:

Model n_jobs: 1, Grid Search n_jobs: 1, Execution Time: 14.12 seconds
Model n_jobs: 8, Grid Search n_jobs: 8, Execution Time: 13.79 seconds
Model n_jobs: 4, Grid Search n_jobs: 4, Execution Time: 8.06 seconds
Model n_jobs: 1, Grid Search n_jobs: 8, Execution Time: 6.10 seconds
Model n_jobs: 1, Grid Search n_jobs: 4, Execution Time: 6.36 seconds
Model n_jobs: 8, Grid Search n_jobs: 1, Execution Time: 7.42 seconds
Model n_jobs: 4, Grid Search n_jobs: 1, Execution Time: 7.32 seconds

Experiment with different configurations to find the optimal balance for your specific use case.
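
If you want the script to pick the winner for you, the loop above can be extended to collect the timings and report the fastest setting (this reuses the train_and_tune function and configurations list defined earlier):

# Collect the timing for each configuration and report the fastest one
results = {(m, g): train_and_tune(m, g) for m, g in configurations}
best_model_n_jobs, best_grid_n_jobs = min(results, key=results.get)
print(f"Fastest: model n_jobs={best_model_n_jobs}, "
      f"grid search n_jobs={best_grid_n_jobs} "
      f"({results[(best_model_n_jobs, best_grid_n_jobs)]:.2f} seconds)")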

In this example, we:

  1. Set the OMP_NUM_THREADS environment variable to "1" to limit OpenMP to a single thread (the variable is set before xgboost is imported so it takes effect).
  2. Generate a synthetic dataset using sklearn.datasets.make_classification and split it into train and test sets.
  3. Define a parameter grid for grid search with different values for max_depth, learning_rate, and subsample.
  4. Create a function train_and_tune that takes model_n_jobs and grid_search_n_jobs as arguments, defines an XGBClassifier with n_estimators=100 and random_state=42 for reproducibility, runs the grid search, and returns the execution time (a variation that also keeps the fitted search object is sketched after this list).
  5. Compare different n_jobs configurations by calling train_and_tune with various combinations of model_n_jobs and grid_search_n_jobs.
  6. Print the execution time for each configuration.
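
The benchmark deliberately discards the fitted search object, since only the timing matters here. If you also want the winning hyperparameters, a small variation (train_and_tune_keep is a hypothetical name) can return the fitted GridSearchCV as well; best_params_ and best_score_ are standard GridSearchCV attributes:

# Hypothetical variation that also returns the fitted search object
def train_and_tune_keep(model_n_jobs, grid_search_n_jobs):
    model = XGBClassifier(n_estimators=100, n_jobs=model_n_jobs, random_state=42)
    grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=3,
                               n_jobs=grid_search_n_jobs)
    start_time = time.perf_counter()
    grid_search.fit(X_train, y_train)
    return time.perf_counter() - start_time, grid_search

elapsed, search = train_and_tune_keep(1, n_cores)
print(search.best_params_)  # best hyperparameter combination found
print(search.best_score_)   # mean cross-validated accuracy of that combination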

Based on the results, you can determine which configuration of n_jobs works best for your system.

In general, if your dataset is large and the model training time dominates the overall execution time, setting model_n_jobs to the number of available CPU cores and grid_search_n_jobs to 1 might be a good choice.

On the other hand, if the dataset is relatively small and the grid search time dominates, setting model_n_jobs to 1 and grid_search_n_jobs to the number of available CPU cores might be more efficient.
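
As a concrete illustration of both rules of thumb, reusing the estimator settings and param_grid from the example above (which split is faster remains something to verify on your own hardware):

# Training-dominated (large dataset): give the cores to XGBoost itself
model = XGBClassifier(n_estimators=100, n_jobs=n_cores, random_state=42)
grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, n_jobs=1)

# Search-dominated (small dataset): give the cores to GridSearchCV
model = XGBClassifier(n_estimators=100, n_jobs=1, random_state=42)
grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, n_jobs=n_cores)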


