The gamma and min_split_loss parameters in XGBoost both control the minimum loss reduction required to make a further partition on a leaf node of the tree. A larger value makes the algorithm more conservative about creating new splits. The two names are aliases for the same setting: gamma is the name traditionally used in the native XGBoost API, while min_split_loss is a more descriptive alias that reads closer to scikit-learn's naming conventions. This example demonstrates how to use both parameters and confirms that they have the same effect on the model's performance.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create two XGBoost classifiers, one using "gamma" and the other using "min_split_loss"
model_gamma = XGBClassifier(gamma=0.5, eval_metric='logloss')
model_min_split_loss = XGBClassifier(min_split_loss=0.5, eval_metric='logloss')
# Train both models on the training set
model_gamma.fit(X_train, y_train)
model_min_split_loss.fit(X_train, y_train)
# Make predictions on the test set
predictions_gamma = model_gamma.predict(X_test)
predictions_min_split_loss = model_min_split_loss.predict(X_test)
# Compare the results
assert (predictions_gamma == predictions_min_split_loss).all()
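As a slightly stronger check (a minimal sketch reusing the two models trained above), you can also compare the predicted probabilities rather than just the class labels:
import numpy as np
# Compare predicted probabilities as a stricter equivalence check
proba_gamma = model_gamma.predict_proba(X_test)
proba_min_split_loss = model_min_split_loss.predict_proba(X_test)
assert np.allclose(proba_gamma, proba_min_split_loss)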
The example below demonstrates the same functionality using the native XGBoost API with DMatrix:
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Convert data to DMatrix
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
# Set up parameters for XGBoost
params_gamma = {
'objective': 'binary:logistic',
'eval_metric': 'logloss',
'gamma': 0.5
}
params_min_split_loss = {
'objective': 'binary:logistic',
'eval_metric': 'logloss',
'min_split_loss': 0.5
}
# Train the models
model_gamma = xgb.train(params_gamma, dtrain, num_boost_round=10)
model_min_split_loss = xgb.train(params_min_split_loss, dtrain, num_boost_round=10)
# Make predictions on the test set
predictions_gamma = model_gamma.predict(dtest).round()
predictions_min_split_loss = model_min_split_loss.predict(dtest).round()
# Compare the results
assert (predictions_gamma == predictions_min_split_loss).all()
In both examples, the models are trained with a gamma or min_split_loss value of 0.5. Increasing this value makes the algorithm more conservative about creating new splits, because a larger loss reduction is required before a split is considered worthwhile. Since the two parameters are aliases for the same setting, the choice between gamma and min_split_loss ultimately comes down to the API being used and personal preference.