XGBoost supports multi-core (multithreaded) training and prediction out of the box, allowing you to utilize multiple CPU cores for faster execution.
This example demonstrates how to enable multithreading in XGBoost using the n_jobs and nthread parameters, as well as how to set the number of threads for BLAS (Basic Linear Algebra Subprograms) operations using the OMP_NUM_THREADS environment variable.
import os
# Set the number of threads for BLAS operations (done before importing xgboost so it takes effect)
os.environ['OMP_NUM_THREADS'] = '4'
from sklearn.datasets import make_classification
from xgboost import XGBClassifier
# Generate a synthetic dataset for binary classification
X, y = make_classification(n_samples=10000, n_features=20, n_informative=10, n_redundant=5, random_state=42)
# Initialize an XGBClassifier with multithreading enabled
# (nthread is the legacy alias of n_jobs; in practice set only one of the two)
model = XGBClassifier(n_jobs=-1, nthread=4, random_state=42)
# Train the XGBoost model
model.fit(X, y)
# Make predictions on the training data
predictions = model.predict(X)
# Print the first few predictions
print(predictions[:5])
In this example:

- We set the OMP_NUM_THREADS environment variable to '4' to control the number of threads used for BLAS operations. This ensures that these low-level operations use the specified number of threads; the variable is set before the imports so that it takes effect.
- We generate a synthetic dataset for binary classification using sklearn.datasets.make_classification() with 10,000 samples and 20 features.
- We initialize an XGBClassifier with n_jobs set to -1, which uses all available CPU cores for training and prediction. The example also passes nthread=4; nthread is the older alias of n_jobs, so in practice you should set only one of the two, preferring n_jobs (a sketch of the equivalent setting for the native API follows this list).
- We train the XGBoost model using the fit() method on the generated dataset.
- We make predictions on the training data using the predict() method.
- Finally, we print the first few predictions to verify the model's output.
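For reference, if you train with the native xgboost.train() API instead of the scikit-learn wrapper, the thread count is controlled by the nthread entry in the parameter dictionary. The following is a minimal sketch; the objective and the number of boosting rounds are illustrative choices, not taken from the example above:

import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=10000, n_features=20, n_informative=10, n_redundant=5, random_state=42)

# Wrap the data in a DMatrix and set nthread in the parameter dictionary
dtrain = xgb.DMatrix(X, label=y)
params = {"objective": "binary:logistic", "nthread": 4, "seed": 42}
booster = xgb.train(params, dtrain, num_boost_round=100)

# Predictions are probabilities for the positive class
print(booster.predict(dtrain)[:5])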
By leveraging multithreading, XGBoost can significantly speed up training and prediction, especially on larger datasets. The n_jobs parameter (and its legacy alias nthread) provides an easy way to enable this parallelization, while setting the OMP_NUM_THREADS environment variable ensures that the underlying BLAS operations also use the specified number of threads.
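To see the effect for yourself, you can time single-threaded against fully parallel training on a larger dataset. The sketch below is illustrative: the 100,000-sample size and the use of time.perf_counter() are assumptions chosen to make the difference visible, not part of the example above.

import time
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Larger synthetic dataset so the parallel speedup is visible
X, y = make_classification(n_samples=100000, n_features=20, n_informative=10, n_redundant=5, random_state=42)

for threads in (1, -1):
    model = XGBClassifier(n_jobs=threads, random_state=42)
    start = time.perf_counter()
    model.fit(X, y)
    elapsed = time.perf_counter() - start
    print(f"n_jobs={threads}: {elapsed:.2f} seconds")

On a multi-core machine the n_jobs=-1 run should finish noticeably faster, although the exact speedup depends on the hardware and the dataset.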