The gamma and min_split_loss parameters in XGBoost both control the minimum loss reduction required to make a further partition on a leaf node of the tree. A larger value makes the algorithm more conservative about creating new splits. The two names are aliases for the same setting: gamma is the name traditionally used in the native XGBoost API, while min_split_loss is a more descriptive alias that reads closer to scikit-learn's naming conventions. This example demonstrates how to use both parameters and confirms that they have the same effect on the model's performance.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create two XGBoost classifiers, one using "gamma" and the other using "min_split_loss"
model_gamma = XGBClassifier(gamma=0.5, eval_metric='logloss')
model_min_split_loss = XGBClassifier(min_split_loss=0.5, eval_metric='logloss')
# Train both models on the training set
model_gamma.fit(X_train, y_train)
model_min_split_loss.fit(X_train, y_train)
# Make predictions on the test set
predictions_gamma = model_gamma.predict(X_test)
predictions_min_split_loss = model_min_split_loss.predict(X_test)
# Compare the results
assert (predictions_gamma == predictions_min_split_loss).all()
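As a slightly stronger check (a minimal sketch reusing the two models trained above), you can also compare the predicted probabilities rather than just the class labels:
import numpy as np
# Compare predicted probabilities as a stricter equivalence check
proba_gamma = model_gamma.predict_proba(X_test)
proba_min_split_loss = model_min_split_loss.predict_proba(X_test)
assert np.allclose(proba_gamma, proba_min_split_loss)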
The example below demonstrates the same functionality using the native XGBoost API with DMatrix:
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Convert data to DMatrix
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
# Set up parameters for XGBoost
params_gamma = {
'objective': 'binary:logistic',
'eval_metric': 'logloss',
'gamma': 0.5
}
params_min_split_loss = {
'objective': 'binary:logistic',
'eval_metric': 'logloss',
'min_split_loss': 0.5
}
# Train the models
model_gamma = xgb.train(params_gamma, dtrain, num_boost_round=10)
model_min_split_loss = xgb.train(params_min_split_loss, dtrain, num_boost_round=10)
# Make predictions on the test set
predictions_gamma = model_gamma.predict(dtest).round()
predictions_min_split_loss = model_min_split_loss.predict(dtest).round()
# Compare the results
assert (predictions_gamma == predictions_min_split_loss).all()
In both examples, the models are trained with a gamma or min_split_loss value of 0.5. Increasing this value makes the algorithm more conservative about creating new splits, because a larger loss reduction is required before a split is considered worthwhile. Since the two parameters are aliases for the same setting, the choice between gamma and min_split_loss ultimately comes down to the API being used and personal preference.