
Configure XGBoost "max_bin" Parameter

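The max_bin parameter in XGBoost sets the maximum number of discrete bins into which each continuous feature is bucketed. It takes effect only with the histogram-based tree methods (tree_method set to hist and, in recent releases, approx). Adjusting max_bin changes split quality, memory usage, and training speed.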

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

# Generate synthetic data
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

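# Initialize the XGBoost regressor with a max_bin value
# (tree_method is set explicitly because max_bin is ignored by the exact method)
model = XGBRegressor(tree_method='hist', max_bin=128, eval_metric='rmse')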

# Fit the model
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)

Understanding the “max_bin” Parameter

The max_bin parameter determines the maximum number of bins used to discretize each continuous feature during tree construction. Binning replaces raw feature values with a finite set of bin indices, so the tree builder only evaluates candidate splits at bin boundaries instead of at every distinct value. This speeds up training and reduces memory usage. The default value of max_bin in XGBoost is 256.
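The idea of binning can be illustrated with plain NumPy. This is a simplified sketch that assumes evenly spaced quantile cut points; XGBoost builds its bins internally with its own quantile sketching, so the code below is not its actual implementation, and the tiny max_bin value is chosen only to keep the output readable.

```python
import numpy as np

rng = np.random.default_rng(0)
feature = rng.normal(size=1000)  # one continuous feature

max_bin = 8  # deliberately small for illustration; XGBoost defaults to 256

# max_bin - 1 interior cut points placed at evenly spaced quantiles
quantiles = np.linspace(0, 1, max_bin + 1)[1:-1]
cut_points = np.quantile(feature, quantiles)

# Map every raw value to a bin index in the range [0, max_bin - 1]
bin_ids = np.digitize(feature, cut_points)

print("cut points:", np.round(cut_points, 2))
print("samples per bin:", np.bincount(bin_ids, minlength=max_bin))
```

After binning, a split search only has to consider the cut points as thresholds, rather than every distinct value in the feature.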

Choosing the Right “max_bin” Value

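The value of max_bin trades split quality against memory usage and training speed. A larger value gives the tree builder more candidate split points per feature, which can improve the optimality of splits at the cost of longer training and larger histograms in memory. A smaller value trains faster and uses less memory but evaluates coarser splits, which can reduce accuracy.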

Practical Tips


