Configure XGBoost "min_split_loss" Parameter

The min_split_loss parameter in XGBoost is an alias for the gamma parameter, which controls the minimum loss reduction required to make a split on a leaf node of the tree.

By adjusting min_split_loss, you can influence the model’s complexity and its ability to generalize. The example below fits a classifier with a higher-than-default value on a synthetic dataset:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Generate synthetic data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, random_state=42)

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the XGBoost classifier with a higher min_split_loss value
model = XGBClassifier(min_split_loss=0.5, eval_metric='logloss')

# Fit the model
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)

As discussed in the tip on configuring the gamma parameter, min_split_loss is a regularization term: it specifies the minimum improvement in the model’s objective function that a new partition must bring to justify its creation. min_split_loss accepts any non-negative value, and higher values make the model more conservative, since more candidate splits fail to clear the threshold.
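To see this effect directly, the rough sketch below (the threshold values are arbitrary) retrains the classifier from the example above at several min_split_loss settings and counts the split nodes across all trees using trees_to_dataframe(), which requires pandas:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Same synthetic data as the example above
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Arbitrary thresholds chosen to illustrate the trend
for msl in [0, 0.5, 5]:
    model = XGBClassifier(min_split_loss=msl, eval_metric='logloss')
    model.fit(X_train, y_train)
    # Non-leaf rows in the tree dump correspond to split nodes
    trees = model.get_booster().trees_to_dataframe()
    n_splits = int((trees['Feature'] != 'Leaf').sum())
    print(f"min_split_loss={msl}: {n_splits} splits, test accuracy={model.score(X_test, y_test):.3f}")

Expect the split count to shrink as min_split_loss grows; whether test accuracy improves depends on how much the unconstrained model was overfitting.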

To recap, the key points when configuring the min_split_loss parameter are:

- min_split_loss is an alias for gamma; the two parameters are interchangeable.
- It sets the minimum loss reduction a candidate split must achieve in order to be made.
- It accepts any non-negative value; the default is 0, which places no restriction on splits.
- Higher values make the model more conservative, producing simpler trees that may generalize better.

For practical guidance on choosing the right min_split_loss value, refer to the tip on configuring the gamma parameter.
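As one practical starting point, a cross-validated grid search is a common way to pick a value; the minimal sketch below uses an arbitrary candidate grid:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Synthetic data as in the example above
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, random_state=42)

# Candidate values are arbitrary; widen or narrow the grid for your problem
param_grid = {'min_split_loss': [0, 0.1, 0.5, 1, 5]}

# Passing min_split_loss at construction ensures it is a known, searchable parameter
grid = GridSearchCV(XGBClassifier(min_split_loss=0, eval_metric='logloss'), param_grid, cv=5, scoring='accuracy')
grid.fit(X, y)

print(f"Best min_split_loss: {grid.best_params_['min_split_loss']}")
print(f"Best cross-validated accuracy: {grid.best_score_:.3f}")

Because min_split_loss interacts with other complexity controls such as max_depth and min_child_weight, it is often tuned alongside them rather than in isolation.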
