
XGBoost Multiple CPUs for Training and Prediction

XGBoost is known for its speed and efficiency, and it supports multiple CPUs (multithreading) out of the box for both training and prediction.

By utilizing multiple CPUs, you can further accelerate your training and prediction tasks.

In this example, we’ll show you how to configure XGBoost to take full advantage of your machine’s processing power.

Here’s a quick guide on how to enable multiple CPU support in XGBoost:

import os
# Cap the number of OpenMP threads (must be set before importing xgboost)
os.environ['OMP_NUM_THREADS'] = '4'

from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Generate a synthetic dataset for binary classification
X, y = make_classification(n_samples=10000, n_features=20, n_informative=10, n_redundant=5, random_state=42)

# Initialize an XGBClassifier with multithreading enabled
model = XGBClassifier(n_jobs=-1, random_state=42)

# Train the XGBoost model
model.fit(X, y)

# Make predictions on the training data
predictions = model.predict(X)

# Print the first few predictions
print(predictions[:5])

In this example:

  1. We set the OMP_NUM_THREADS environment variable to ‘4’ before importing xgboost. XGBoost’s core is parallelized with OpenMP, so this caps the number of threads its low-level operations can use.

  2. We generate a synthetic dataset for binary classification using sklearn.datasets.make_classification() with 10,000 samples and 20 features.

  3. We initialize an XGBClassifier with n_jobs set to -1, which uses all available CPU cores for training and prediction. (nthread is a deprecated alias for n_jobs, so only one of the two should be set.)

  4. We train the XGBoost model using the fit() method on the generated dataset.

  5. We make predictions on the training data using the predict() method.

  6. Finally, we print the first few predictions to verify the model’s output.

By leveraging multithreading, XGBoost can significantly speed up training and prediction times, especially on larger datasets.

The n_jobs parameter provides an easy way to enable this parallelization, while the OMP_NUM_THREADS environment variable caps the number of OpenMP threads used by XGBoost’s underlying C++ core.
