
XGBoost Report Execution Time

Measuring the execution time of XGBoost predictions is crucial for understanding the performance of your model and optimizing its efficiency.

In this example, we’ll demonstrate how to use the time.perf_counter() function to measure prediction times and compare the performance of single-threaded and multi-threaded predictions.

import time
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Generate a synthetic dataset for binary classification
X, y = make_classification(n_samples=1000000, n_features=20, n_informative=10, n_redundant=5, random_state=42)

# Initialize two XGBClassifier models with different numbers of threads
model_single_thread = XGBClassifier(n_jobs=1, random_state=42)
model_multi_thread = XGBClassifier(n_jobs=4, random_state=42)

# Train both models on the generated dataset
model_single_thread.fit(X, y)
model_multi_thread.fit(X, y)

# Measure prediction time for the single-threaded model
start_time = time.perf_counter()
predictions_single_thread = model_single_thread.predict(X)
end_time = time.perf_counter()
single_thread_time = end_time - start_time

# Measure prediction time for the multi-threaded model
start_time = time.perf_counter()
predictions_multi_thread = model_multi_thread.predict(X)
end_time = time.perf_counter()
multi_thread_time = end_time - start_time

# Print the prediction times
print(f"Single-threaded prediction time: {single_thread_time:.3f} seconds")
print(f"Multi-threaded prediction time: {multi_thread_time:.3f} seconds")

You may see results like the following:

Single-threaded prediction time: 1.823 seconds
Multi-threaded prediction time: 0.547 seconds
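
A single timed call can be noisy, since background processes and caching effects influence any one run. For a more stable estimate, the standard-library timeit module (which uses time.perf_counter() by default) can repeat the measurement so you can report the best run. This is a minimal sketch that reuses the models and X from the example above; the repeat count is illustrative, not a tuned recommendation.

import timeit

# Time each predict() call 5 times and keep the fastest run,
# which reduces the influence of transient system noise
single_times = timeit.repeat(lambda: model_single_thread.predict(X), number=1, repeat=5)
multi_times = timeit.repeat(lambda: model_multi_thread.predict(X), number=1, repeat=5)

print(f"Single-threaded best of 5: {min(single_times):.3f} seconds")
print(f"Multi-threaded best of 5: {min(multi_times):.3f} seconds")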

In this example:

  1. We generate a synthetic dataset for binary classification using sklearn.datasets.make_classification() with 1,000,000 samples and 20 features.

  2. We initialize two XGBClassifier models: one with a single thread (n_jobs=1) and another with four threads (n_jobs=4).

  3. We train both models on the generated dataset using the fit() method.

  4. To measure the prediction time for the single-threaded model, we:

    • Record the start time using time.perf_counter().
    • Make predictions on the training data using the predict() method.
    • Record the end time using time.perf_counter().
    • Calculate the prediction time by subtracting the start time from the end time.
  5. We repeat the same process to measure the prediction time for the multi-threaded model.

  6. Finally, we print the prediction times for both models to compare their performance.
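
When comparing configurations across datasets of different sizes, it can also help to normalize the raw timings into per-sample figures. The sketch below reuses the single_thread_time, multi_thread_time, and X variables from the example above to report throughput and average per-row latency.

# Convert total prediction time into throughput (rows per second)
# and average per-row latency in microseconds
n_rows = X.shape[0]

for label, elapsed in [("single-threaded", single_thread_time),
                       ("multi-threaded", multi_thread_time)]:
    throughput = n_rows / elapsed
    latency_us = elapsed / n_rows * 1e6
    print(f"{label}: {throughput:,.0f} rows/sec, {latency_us:.2f} µs/row")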

By using time.perf_counter(), you can accurately measure the execution time of XGBoost predictions and assess the impact of using different numbers of threads. This information can help you make informed decisions about the optimal configuration for your model, balancing prediction speed and resource utilization.
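
To go beyond a two-way comparison, you can time predictions across a range of thread counts and pick the configuration that best suits your hardware. The following is a minimal, self-contained sketch that follows the same pattern as the example above; the dataset size and list of thread counts are illustrative and should be adapted to the cores available on your machine.

import time
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Smaller synthetic dataset so the sweep finishes quickly (sizes are illustrative)
X, y = make_classification(n_samples=100000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)

# Train and time predictions for several thread counts
for n_jobs in [1, 2, 4, 8]:
    model = XGBClassifier(n_jobs=n_jobs, random_state=42)
    model.fit(X, y)

    start_time = time.perf_counter()
    model.predict(X)
    elapsed = time.perf_counter() - start_time

    print(f"n_jobs={n_jobs}: {elapsed:.3f} seconds")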


