When it comes to gradient boosting for regression tasks, both XGBoost and CatBoost are popular choices known for their strong performance and efficiency.

But which one trains faster?

Let’s put them head-to-head and find out.

First, ensure you have the `catboost` library installed. If not, you can install it using pip:

```
pip install catboost
```

Now, let’s set up our speed test:

```
from sklearn.datasets import make_regression
from xgboost import XGBRegressor
from catboost import CatBoostRegressor
import time
# Generate a synthetic regression dataset
X, y = make_regression(n_samples=100000, n_features=10, noise=0.1, random_state=42)
# Initialize the regressors with comparable hyperparameters
xgb_reg = XGBRegressor(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42)
cb_reg = CatBoostRegressor(iterations=100, learning_rate=0.1, max_depth=3, random_state=42, verbose=False)
# Fit XGBRegressor and measure the training time
start_time = time.perf_counter()
xgb_reg.fit(X, y)
xgb_time = time.perf_counter() - start_time
print(f"XGBRegressor training time: {xgb_time:.2f} seconds")
# Fit CatBoostRegressor and measure the training time
start_time = time.perf_counter()
cb_reg.fit(X, y)
cb_time = time.perf_counter() - start_time
print(f"CatBoostRegressor training time: {cb_time:.2f} seconds")
```

We begin by generating a large synthetic regression dataset with 100,000 samples and 10 features using scikit-learn’s `make_regression`. This provides ample data to observe a significant difference in training times.

Next, we initialize our competitors: `XGBRegressor` and `CatBoostRegressor`. For a fair comparison, we use similar hyperparameters for both:

- 100 boosting iterations
- Learning rate of 0.1
- Max tree depth of 3
- `random_state` set to 42 for reproducibility
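The two libraries name the number of boosting rounds differently (`n_estimators` for `XGBRegressor`, `iterations` for `CatBoostRegressor`), so it can help to keep the shared settings in one place. The following is a minimal sketch of that idea; the `shared` dictionary, its neutral key names, and the two helper functions are hypothetical and only illustrate how the keyword arguments used in the benchmark line up.

```python
# Shared settings from the list above, stored under neutral (hypothetical) keys.
shared = {"rounds": 100, "learning_rate": 0.1, "max_depth": 3, "seed": 42}


def xgb_kwargs(cfg):
    """Build the keyword arguments passed to XGBRegressor in the benchmark."""
    return {
        "n_estimators": cfg["rounds"],
        "learning_rate": cfg["learning_rate"],
        "max_depth": cfg["max_depth"],
        "random_state": cfg["seed"],
    }


def catboost_kwargs(cfg):
    """Build the keyword arguments passed to CatBoostRegressor in the benchmark."""
    return {
        "iterations": cfg["rounds"],
        "learning_rate": cfg["learning_rate"],
        "max_depth": cfg["max_depth"],
        "random_state": cfg["seed"],
        "verbose": False,  # silence CatBoost's per-iteration logging
    }


print(xgb_kwargs(shared))
print(catboost_kwargs(shared))
```

With both libraries installed, the models could then be created as `XGBRegressor(**xgb_kwargs(shared))` and `CatBoostRegressor(**catboost_kwargs(shared))`, which keeps the two configurations from drifting apart.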

We then fit each regressor on the dataset, measuring the training time with `time.perf_counter()` from the `time` module. The `start_time` is recorded before fitting, and the elapsed time is calculated once fitting completes.

Finally, we print the training times for both regressors.

Here’s an example output:

```
XGBRegressor training time: 0.16 seconds
CatBoostRegressor training time: 0.74 seconds
```
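The timing pattern used above is the same for both models, so it can be wrapped in a small helper. The sketch below assumes only that the model exposes a `fit(X, y)` method; the `time_fit` function and the `SleepyModel` stand-in are hypothetical names used for illustration, not part of either library.

```python
import time


def time_fit(model, X, y):
    """Return the wall-clock seconds taken by model.fit(X, y)."""
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start


class SleepyModel:
    """Hypothetical stand-in estimator whose fit() only sleeps briefly."""

    def fit(self, X, y):
        time.sleep(0.05)
        return self


elapsed = time_fit(SleepyModel(), X=[[0.0]], y=[0.0])
print(f"SleepyModel training time: {elapsed:.2f} seconds")
```

In the benchmark, the same helper could be called as `time_fit(xgb_reg, X, y)` and `time_fit(cb_reg, X, y)`.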

Below is an updated comparison that repeats each experiment 10 times and plots the distributions.

```
import time
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from xgboost import XGBRegressor
from catboost import CatBoostRegressor
# Generate a synthetic regression dataset
X, y = make_regression(n_samples=100000, n_features=10, noise=0.1, random_state=42)
# Initialize the regressors with comparable hyperparameters
xgb_reg = XGBRegressor(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42)
cb_reg = CatBoostRegressor(iterations=100, learning_rate=0.1, max_depth=3, random_state=42, verbose=False)
# Lists to store training times
xgb_times = []
gb_times = []
# Run the benchmark 10 times
for i in range(10):
    # Measure training time for XGBRegressor
    start_time = time.perf_counter()
    xgb_reg.fit(X, y)
    xgb_duration = time.perf_counter() - start_time
    xgb_times.append(xgb_duration)
    # Measure training time for CatBoostRegressor
    start_time = time.perf_counter()
    cb_reg.fit(X, y)
    gb_duration = time.perf_counter() - start_time
    gb_times.append(gb_duration)
    # Report progress
    print(f'> {i} xgb: {xgb_duration:.3f}, gb: {gb_duration:.3f}')
# Calculate mean and standard deviation of training times
xgb_mean = np.mean(xgb_times)
xgb_std = np.std(xgb_times)
gb_mean = np.mean(gb_times)
gb_std = np.std(gb_times)
# Print mean and standard deviation of training times
print(f"XGBRegressor mean training time: {xgb_mean:.2f} seconds (std: {xgb_std:.2f})")
print(f"CatBoostRegressor mean training time: {gb_mean:.2f} seconds (std: {gb_std:.2f})")
# Plot the distributions as side-by-side boxplots using matplotlib
plt.figure(figsize=(10, 6))
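plt.boxplot([xgb_times, gb_times])
# Label the boxes via xticks; the boxplot `labels` keyword is deprecated in newer matplotlib releases
plt.xticks([1, 2], ['XGBoost', 'CatBoost'])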
plt.ylabel('Training Time (seconds)')
plt.title('Training Time Comparison')
plt.show()
```

The results may look something like the following:

```
> 0 xgb: 0.142, gb: 0.713
> 1 xgb: 0.145, gb: 0.668
> 2 xgb: 0.149, gb: 0.669
> 3 xgb: 0.183, gb: 0.666
> 4 xgb: 0.152, gb: 0.671
> 5 xgb: 0.190, gb: 0.674
> 6 xgb: 0.146, gb: 0.698
> 7 xgb: 0.147, gb: 0.670
> 8 xgb: 0.165, gb: 0.792
> 9 xgb: 0.174, gb: 0.691
XGBRegressor mean training time: 0.16 seconds (std: 0.02)
CatBoostRegressor mean training time: 0.69 seconds (std: 0.04)
```

Exact times will vary based on your hardware, but in this case, `XGBRegressor` trains roughly four times as fast as `CatBoostRegressor` (a mean of 0.16 seconds versus 0.69 seconds).
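The size of the gap can be checked directly from the sample timings listed above. This short sketch copies the ten per-run values from that example output and divides the mean CatBoost time by the mean XGBoost time; on this particular run the ratio works out to roughly 4.3. The `speedup` name is just for illustration.

```python
from statistics import mean

# Per-run training times (seconds) copied from the example output above
xgb_times = [0.142, 0.145, 0.149, 0.183, 0.152, 0.190, 0.146, 0.147, 0.165, 0.174]
gb_times = [0.713, 0.668, 0.669, 0.666, 0.671, 0.674, 0.698, 0.670, 0.792, 0.691]

# Ratio of mean training times: how many times longer CatBoost took
speedup = mean(gb_times) / mean(xgb_times)
print(f"CatBoost took {speedup:.1f}x as long as XGBoost on average")
```

The same calculation can be applied to the `xgb_times` and `gb_times` lists collected by the benchmark loop on your own machine.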

This speed advantage can be significant when working with large datasets or when you need to iterate quickly. By leveraging XGBoost’s speed, you can experiment with more features and hyperparameters, ultimately building better models faster.

Of course, training speed isn’t the only consideration. CatBoost has its own strengths, like excellent handling of categorical features. The best choice depends on your specific dataset and requirements.