
Verify CPU Core Utilization During XGBoost Model Training

When training XGBoost models, it’s important to ensure that all available CPU cores are being utilized to maximize performance.

The n_jobs parameter in XGBoost's scikit-learn API sets the number of parallel threads, and therefore the number of CPU cores, used during training.

Setting n_jobs=-1 tells XGBoost to use all available cores.

In this example, we’ll demonstrate how to check if all CPU cores are being utilized when training an XGBoost model.

We’ll use Python’s psutil library to monitor CPU usage before and during model training.

First, install the psutil library with your preferred Python package manager, such as pip:

pip install psutil

Next, we can fit an XGBoost model using all CPU cores and report CPU utilization:

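import threading
import psutil
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Get the number of logical CPU cores
num_cores = psutil.cpu_count(logical=True)
print(f"Number of logical CPU cores: {num_cores}")

# Generate a synthetic dataset
X, y = make_classification(n_samples=1000000, n_features=20, random_state=42)

# Define the XGBoost model with n_jobs=-1 to use all cores
model = XGBClassifier(n_estimators=100, n_jobs=-1)

# Get CPU usage before training
cpu_percent_before = psutil.cpu_percent(interval=1)
print(f"CPU usage before training: {cpu_percent_before}%")

# Sample system-wide CPU usage in a background thread while fit() runs;
# measuring after fit() returns would only capture usage after training
samples = []
done = threading.Event()

def sample_cpu():
    while not done.is_set():
        samples.append(psutil.cpu_percent(interval=0.5))

sampler = threading.Thread(target=sample_cpu)
sampler.start()

# Train the XGBoost model
model.fit(X, y)

# Stop sampling and report the average CPU usage seen during training
done.set()
sampler.join()
cpu_percent_during = sum(samples) / len(samples)
print(f"CPU usage during training: {cpu_percent_during:.1f}%")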

When you run this code, you should see output similar to the following:

Number of logical CPU cores: 8
CPU usage before training: 5.5%
CPU usage during training: 20.7%

The specific CPU usage percentages may vary depending on your system and other running processes.

However, you should observe a significant increase in CPU utilization during XGBoost training when n_jobs=-1.
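
A single system-wide percentage cannot show whether every core is busy or only a few. The sketch below is one possible way to check this, using psutil.cpu_percent(percpu=True) to record per-core usage while training runs. The helper names sample_per_core and mean_per_core are hypothetical, and the smaller dataset size is an assumption chosen to keep the run short.

```python
import threading

import psutil


def sample_per_core(work, interval=0.5):
    """Run work() while sampling per-core CPU usage; return all samples."""
    samples = []
    done = threading.Event()

    def sampler():
        while True:
            # percpu=True returns one percentage per logical core
            samples.append(psutil.cpu_percent(interval=interval, percpu=True))
            if done.is_set():
                break

    thread = threading.Thread(target=sampler)
    thread.start()
    try:
        work()
    finally:
        done.set()
        thread.join()
    return samples


def mean_per_core(samples):
    """Average each core's usage over all samples."""
    return [sum(core) / len(core) for core in zip(*samples)]


if __name__ == "__main__":
    from sklearn.datasets import make_classification
    from xgboost import XGBClassifier

    # Assumed dataset size, smaller than the main example
    X, y = make_classification(n_samples=200_000, n_features=20, random_state=42)
    model = XGBClassifier(n_estimators=100, n_jobs=-1)

    per_core = mean_per_core(sample_per_core(lambda: model.fit(X, y)))
    for core, pct in enumerate(per_core):
        print(f"core {core}: {pct:.1f}%")
```

If a few cores show high usage while the rest stay near idle, training is probably not using every core, although other processes on the machine can also skew the per-core figures.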

To further explore the impact of the n_jobs parameter, you can repeat the process with different values:

# Define the XGBoost model with n_jobs=1 to use a single core
model = XGBClassifier(n_estimators=100, n_jobs=1)

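With n_jobs=1, you should see lower CPU usage during training than with n_jobs=-1. Because psutil.cpu_percent() reports an average across all logical cores, one fully busy thread on an 8-core machine accounts for only about 12.5% of the total.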
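
To make the comparison concrete, the following sketch trains the same model with n_jobs=1 and n_jobs=-1 and reports both the mean system-wide CPU usage and the wall-clock fit time. The helper name measure is hypothetical, and the dataset size is an assumption; treat the printed numbers as rough indicators, since they depend on your hardware and background load.

```python
import threading
import time

import psutil


def measure(work, interval=0.5):
    """Run work(); return (mean system-wide CPU %, elapsed seconds)."""
    samples = []
    done = threading.Event()

    def sampler():
        while True:
            samples.append(psutil.cpu_percent(interval=interval))
            if done.is_set():
                break

    thread = threading.Thread(target=sampler)
    thread.start()
    start = time.perf_counter()
    try:
        work()
    finally:
        elapsed = time.perf_counter() - start
        done.set()
        thread.join()
    return sum(samples) / len(samples), elapsed


if __name__ == "__main__":
    from sklearn.datasets import make_classification
    from xgboost import XGBClassifier

    # Assumed dataset size, smaller than the main example
    X, y = make_classification(n_samples=200_000, n_features=20, random_state=42)
    for n_jobs in (1, -1):
        model = XGBClassifier(n_estimators=100, n_jobs=n_jobs)
        cpu, seconds = measure(lambda: model.fit(X, y))
        print(f"n_jobs={n_jobs}: mean CPU {cpu:.1f}%, fit time {seconds:.1f}s")
```

Reporting fit time alongside CPU usage helps confirm that the extra cores are doing useful work and not just raising utilization.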

By monitoring CPU usage and adjusting the n_jobs parameter, you can ensure that your XGBoost models are making efficient use of available CPU resources during training.
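
System-wide figures include every other process on the machine. As a complementary check, this minimal sketch measures only the current Python process with psutil.Process().cpu_percent(), which can exceed 100% when several threads run in parallel. The helper name process_cpu_during is hypothetical, and the dataset size is an assumption.

```python
import psutil


def process_cpu_during(work):
    """Return this process's CPU % accumulated while work() runs.

    Values above 100 mean more than one core was busy on average.
    """
    proc = psutil.Process()
    proc.cpu_percent(interval=None)  # first call only sets the baseline
    work()
    # Second call reports usage since the baseline call above
    return proc.cpu_percent(interval=None)


if __name__ == "__main__":
    from sklearn.datasets import make_classification
    from xgboost import XGBClassifier

    # Assumed dataset size, smaller than the main example
    X, y = make_classification(n_samples=200_000, n_features=20, random_state=42)
    model = XGBClassifier(n_estimators=100, n_jobs=-1)

    pct = process_cpu_during(lambda: model.fit(X, y))
    print(f"Process CPU during training: {pct:.1f}%")
    print(f"Approximate cores kept busy: {pct / 100:.1f}")
```

Dividing the result by 100 gives a rough estimate of how many cores the training process kept busy on average.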


