When using XGBoost with RandomizedSearchCV for hyperparameter tuning, the `n_jobs` parameter can be leveraged to parallelize computations across multiple CPU cores, potentially speeding up the process. However, finding the optimal configuration for `n_jobs` in both XGBoost and RandomizedSearchCV requires some experimentation, as the available resources need to be divided between model training and the search itself. This example demonstrates how to set `n_jobs` for both XGBoost and RandomizedSearchCV, and compares the execution times of different configurations to help find the most efficient setup for your specific use case.
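Before the full example, here is a minimal sketch (not part of the timed benchmark below, with purely illustrative values) of the two places `n_jobs` is set. The key point is that the settings multiply: RandomizedSearchCV can run up to its `n_jobs` fits at once, and each fit can use up to the model's `n_jobs` threads, so the effective thread count is roughly their product.
# Sketch only: the two independent n_jobs knobs (illustrative values)
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier
model = XGBClassifier(n_jobs=2)  # threads used by each individual model fit
search = RandomizedSearchCV(model, {"max_depth": [3, 5]}, n_iter=2, cv=3, n_jobs=4)  # parallel fits
# Effective parallelism is roughly model n_jobs * search n_jobs (here ~8 threads),
# so the product should not greatly exceed the number of available cores.
The complete, timed example follows.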
# XGBoosting.com
# XGBoost Set n_jobs for RandomizedSearchCV
import os
# Set environment variable to limit OpenMP to 1 thread
os.environ["OMP_NUM_THREADS"] = "1"
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from xgboost import XGBClassifier
import time
import multiprocessing
# Get the number of available CPU cores
n_cores = multiprocessing.cpu_count()
# Generate a synthetic dataset
X, y = make_classification(n_samples=10000, n_features=20, random_state=42)
# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define the parameter distribution for randomized search
param_dist = {
    'max_depth': [3, 5, 7, 9],
    'learning_rate': [0.1, 0.01, 0.001],
    'subsample': [0.8, 1.0]
}
# Function to train the model and perform randomized search
def train_and_tune(model_n_jobs, random_search_n_jobs):
    model = XGBClassifier(n_estimators=100, n_jobs=model_n_jobs, random_state=42)
    random_search = RandomizedSearchCV(estimator=model, param_distributions=param_dist, n_iter=10, cv=3, n_jobs=random_search_n_jobs)
    start_time = time.perf_counter()
    random_search.fit(X_train, y_train)
    end_time = time.perf_counter()
    return end_time - start_time
# Compare different n_jobs configurations
configurations = [
    (1, 1),
    (n_cores, n_cores),
    (n_cores//2, n_cores//2),
    (1, n_cores),
    (1, n_cores//2),
    (n_cores, 1),
    (n_cores//2, 1)
]
for model_n_jobs, random_search_n_jobs in configurations:
    execution_time = train_and_tune(model_n_jobs, random_search_n_jobs)
    print(f"Model n_jobs: {model_n_jobs}, Random Search n_jobs: {random_search_n_jobs}, Execution Time: {execution_time:.2f} seconds")
You may see results that look as follows:
Model n_jobs: 1, Random Search n_jobs: 1, Execution Time: 11.05 seconds
Model n_jobs: 8, Random Search n_jobs: 8, Execution Time: 11.48 seconds
Model n_jobs: 4, Random Search n_jobs: 4, Execution Time: 7.37 seconds
Model n_jobs: 1, Random Search n_jobs: 8, Execution Time: 5.71 seconds
Model n_jobs: 1, Random Search n_jobs: 4, Execution Time: 6.03 seconds
Model n_jobs: 8, Random Search n_jobs: 1, Execution Time: 7.26 seconds
Model n_jobs: 4, Random Search n_jobs: 1, Execution Time: 5.20 seconds
In this example, we:
- Set the `OMP_NUM_THREADS` environment variable to "1" to limit OpenMP to a single thread.
- Generate a synthetic dataset using `sklearn.datasets.make_classification`.
- Define an `XGBClassifier` with `n_estimators` set to 100 and `random_state` set to 42 for reproducibility.
- Define a parameter distribution for randomized search with different values for `max_depth`, `learning_rate`, and `subsample`.
- Create a function `train_and_tune` that takes `model_n_jobs` and `random_search_n_jobs` as arguments, trains the model, performs randomized search, and returns the execution time (see the sketch after this list for also reporting the best hyperparameters).
- Compare different `n_jobs` configurations by calling `train_and_tune` with various combinations of `model_n_jobs` and `random_search_n_jobs`.
- Print the execution time for each configuration.
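The benchmark above only returns the elapsed time, but after fitting, RandomizedSearchCV also exposes the tuning results. The following is a small sketch (reusing `param_dist`, `n_cores`, and the data splits defined above, with an arbitrary `n_jobs` choice) of how the best configuration could be reported as well:
# Sketch: inspect the search results in addition to timing (reuses objects from the example above)
model = XGBClassifier(n_estimators=100, n_jobs=1, random_state=42)
random_search = RandomizedSearchCV(estimator=model, param_distributions=param_dist, n_iter=10, cv=3, n_jobs=n_cores)
random_search.fit(X_train, y_train)
print(f"Best parameters: {random_search.best_params_}")
print(f"Best CV accuracy: {random_search.best_score_:.3f}")
print(f"Test accuracy: {random_search.score(X_test, y_test):.3f}")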
Based on the results, you can determine which configuration of `n_jobs` works best for your system and dataset.
Experiment with different configurations to find the optimal balance between parallelizing model training and the randomized search itself. Keep in mind that the ideal setup may vary depending on the size of your dataset, the complexity of your model, and the available computational resources.
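If you would rather select the fastest setting programmatically than read it off the printout, a minimal sketch (reusing `train_and_tune` and `configurations` from the example above) might look like this:
# Sketch: time each configuration and pick the fastest one
timings = {cfg: train_and_tune(*cfg) for cfg in configurations}
best_cfg = min(timings, key=timings.get)
print(f"Fastest configuration: model n_jobs={best_cfg[0]}, search n_jobs={best_cfg[1]} "
      f"({timings[best_cfg]:.2f} seconds)")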