When training an XGBoost model and performing hyperparameter tuning with grid search, the n_jobs parameter can be used to parallelize the workload across multiple CPU cores.
However, finding the optimal configuration for n_jobs can be tricky, as it needs to be divided between model training and grid search.
This example demonstrates how to configure n_jobs for both tasks and compares the execution time of different configurations.
# XGBoosting.com
# XGBoost Configure n_jobs for Grid Search
import os
# Set environment variable to limit OpenMP to 1 thread
os.environ["OMP_NUM_THREADS"] = "1"
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from xgboost import XGBClassifier
import time
import multiprocessing
# Get the number of available CPU cores
n_cores = multiprocessing.cpu_count()
# Generate a synthetic dataset
X, y = make_classification(n_samples=10000, n_features=20, random_state=42)
# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define the parameter grid for grid search
param_grid = {
    'max_depth': [3, 5, 7],
    'learning_rate': [0.1, 0.01, 0.001],
    'subsample': [0.8, 1.0]
}
# Function to train the model and perform grid search
def train_and_tune(model_n_jobs, grid_search_n_jobs):
    model = XGBClassifier(n_estimators=100, n_jobs=model_n_jobs, random_state=42)
    grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, n_jobs=grid_search_n_jobs)
    start_time = time.perf_counter()
    grid_search.fit(X_train, y_train)
    end_time = time.perf_counter()
    return end_time - start_time
# Compare different n_jobs configurations
configurations = [
    (1, 1),
    (n_cores, n_cores),
    (n_cores//2, n_cores//2),
    (1, n_cores),
    (1, n_cores//2),
    (n_cores, 1),
    (n_cores//2, 1)
]
for model_n_jobs, grid_search_n_jobs in configurations:
    execution_time = train_and_tune(model_n_jobs, grid_search_n_jobs)
    print(f"Model n_jobs: {model_n_jobs}, Grid Search n_jobs: {grid_search_n_jobs}, Execution Time: {execution_time:.2f} seconds")
You may see results that look as follows:
Model n_jobs: 1, Grid Search n_jobs: 1, Execution Time: 14.12 seconds
Model n_jobs: 8, Grid Search n_jobs: 8, Execution Time: 13.79 seconds
Model n_jobs: 4, Grid Search n_jobs: 4, Execution Time: 8.06 seconds
Model n_jobs: 1, Grid Search n_jobs: 8, Execution Time: 6.10 seconds
Model n_jobs: 1, Grid Search n_jobs: 4, Execution Time: 6.36 seconds
Model n_jobs: 8, Grid Search n_jobs: 1, Execution Time: 7.42 seconds
Model n_jobs: 4, Grid Search n_jobs: 1, Execution Time: 7.32 seconds
Experiment with different configurations to find the optimal balance for your specific use case.
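In the sample timings above, requesting 8 threads at both levels (8 x 8 = 64 threads on 8 cores) was almost as slow as the fully serial run, which is consistent with the cores being oversubscribed, although the cause was not measured here. As a rough starting point, you could restrict your experiments to pairs whose product equals the core count. The sketch below is only an illustration: balanced_configurations is a hypothetical helper, not part of XGBoost or scikit-learn.

```python
import os


def balanced_configurations(n_cores):
    """Return (model_n_jobs, grid_search_n_jobs) pairs whose product equals n_cores.

    Hypothetical helper for illustration only.
    """
    pairs = []
    for model_n_jobs in range(1, n_cores + 1):
        # Keep only splits that use every core exactly once
        if n_cores % model_n_jobs == 0:
            pairs.append((model_n_jobs, n_cores // model_n_jobs))
    return pairs


# os.cpu_count() can return None, so fall back to 1
n_cores = os.cpu_count() or 1
for model_n_jobs, grid_search_n_jobs in balanced_configurations(n_cores):
    print(f"Model n_jobs: {model_n_jobs}, Grid Search n_jobs: {grid_search_n_jobs}")
```

Each pair could then be passed to train_and_tune in place of the hand-written configurations list. Whether a balanced split actually beats the extremes still depends on your data and hardware.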
In this example, we:
- Set the OMP_NUM_THREADS environment variable to “1” to limit OpenMP to a single thread.
- Generate a synthetic dataset using sklearn.datasets.make_classification.
- Define an XGBClassifier with n_estimators set to 100 and random_state set to 42 for reproducibility.
- Define a parameter grid for grid search with different values for max_depth, learning_rate, and subsample.
- Create a function train_and_tune that takes model_n_jobs and grid_search_n_jobs as arguments, trains the model, performs grid search, and returns the execution time.
- Compare different n_jobs configurations by calling train_and_tune with various combinations of model_n_jobs and grid_search_n_jobs.
- Print the execution time for each configuration.
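To judge how much parallel work the grid search can hand out, it helps to count the model fits it schedules. The sketch below does the arithmetic for the parameter grid and cv=3 used in this example. It assumes GridSearchCV is left at its default refit=True, which adds one final fit on the full training set. The variable names are illustrative.

```python
import math

# Same grid and fold count as in the example above
param_grid = {
    'max_depth': [3, 5, 7],
    'learning_rate': [0.1, 0.01, 0.001],
    'subsample': [0.8, 1.0]
}
cv = 3

# Number of parameter combinations: product of the list lengths
n_candidates = math.prod(len(values) for values in param_grid.values())
# Each combination is fitted once per cross-validation fold
n_cv_fits = n_candidates * cv
# Assumption: refit=True (the GridSearchCV default) adds one final fit
n_total_fits = n_cv_fits + 1

print(f"Candidates: {n_candidates}, CV fits: {n_cv_fits}, Total fits: {n_total_fits}")
```

With many more independent fits than cores, the grid search level has plenty of work to spread across workers, which may help explain why giving it the cores was effective on this small dataset.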
Based on the results, you can determine which configuration of n_jobs works best for your system.
In general, if your dataset is large and the model training time dominates the overall execution time, setting model_n_jobs to the number of available CPU cores and grid_search_n_jobs to 1 might be a good choice.
On the other hand, if the dataset is relatively small and the grid search time dominates, setting model_n_jobs to 1 and grid_search_n_jobs to the number of available CPU cores might be more efficient.
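The rule of thumb above can be written down as a small helper. This is a minimal sketch, not a definitive policy: suggest_n_jobs and its training_dominates flag are hypothetical names, and deciding whether training dominates is left to you, for example by timing a single fit first.

```python
def suggest_n_jobs(n_cores, training_dominates):
    """Return a (model_n_jobs, grid_search_n_jobs) starting point.

    Hypothetical helper that encodes the rule of thumb only.
    """
    if training_dominates:
        # Large dataset: give all cores to each model fit, run fits one at a time
        return n_cores, 1
    # Small dataset: single-threaded fits, run many fits in parallel
    return 1, n_cores


model_n_jobs, grid_search_n_jobs = suggest_n_jobs(n_cores=8, training_dominates=False)
print(f"Model n_jobs: {model_n_jobs}, Grid Search n_jobs: {grid_search_n_jobs}")
```

Treat the returned pair as a first guess and confirm it with measurements such as those produced by train_and_tune.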