
XGBoost Compare "learning_rate" vs "eta" Parameters

Both the learning_rate and eta parameters in XGBoost control the step size at each boosting iteration, determining the contribution of each tree to the final prediction.

The eta parameter is preferred in the native XGBoost API, while learning_rate is used in the scikit-learn API.

This example demonstrates how to use both parameters and confirms that they have the same effect on the model’s performance.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create two XGBoost classifiers, one using "learning_rate" and the other using "eta"
model_learning_rate = XGBClassifier(learning_rate=0.1, eval_metric='logloss')
model_eta = XGBClassifier(eta=0.1, eval_metric='logloss')

# Train both models on the training set
model_learning_rate.fit(X_train, y_train)
model_eta.fit(X_train, y_train)

# Make predictions on the test set
predictions_learning_rate = model_learning_rate.predict(X_test)
predictions_eta = model_eta.predict(X_test)

# Compare the results
assert (predictions_learning_rate == predictions_eta).all()
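
To confirm that the two fitted models also score identically on the held-out data, you can compare their accuracy. The snippet below is a minimal sketch that assumes it runs after the code above and uses scikit-learn's accuracy_score.

from sklearn.metrics import accuracy_score

# Both models were trained with the same step size, so their accuracy should match
acc_learning_rate = accuracy_score(y_test, predictions_learning_rate)
acc_eta = accuracy_score(y_test, predictions_eta)
print(f"Accuracy with learning_rate: {acc_learning_rate:.4f}")
print(f"Accuracy with eta: {acc_eta:.4f}")
assert acc_learning_rate == acc_eta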

The example below demonstrates the same functionality using the native XGBoost API with DMatrix:

import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Convert data to DMatrix
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# Set up parameters for XGBoost
params_learning_rate = {
    'objective': 'binary:logistic',
    'eval_metric': 'logloss',
    'learning_rate': 0.1
}

params_eta = {
    'objective': 'binary:logistic',
    'eval_metric': 'logloss',
    'eta': 0.1
}

# Train the models
model_learning_rate = xgb.train(params_learning_rate, dtrain, num_boost_round=10)
model_eta = xgb.train(params_eta, dtrain, num_boost_round=10)

# Make predictions on the test set
predictions_learning_rate = model_learning_rate.predict(dtest).round()
predictions_eta = model_eta.predict(dtest).round()

# Compare the results
assert (predictions_learning_rate == predictions_eta).all()
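
Beyond the rounded class labels, you can also check that the raw predicted probabilities from the two native-API models agree. The sketch below assumes it runs after the code above and uses NumPy's allclose to allow for floating-point noise.

import numpy as np

# Compare raw probabilities rather than rounded labels
probs_learning_rate = model_learning_rate.predict(dtest)
probs_eta = model_eta.predict(dtest)
assert np.allclose(probs_learning_rate, probs_eta)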

The learning_rate and eta parameters serve the same purpose in XGBoost: they scale the contribution of each new tree at every boosting iteration. A smaller value (e.g., 0.1) makes the model more conservative, shrinking each tree's contribution so that more boosting rounds are typically needed, while a larger value (e.g., 0.3) makes the model more aggressive, learning faster but with a higher risk of overfitting.
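
To see this effect in practice, you can fit the same classifier with a small and a larger step size and compare the held-out log loss. The snippet below is a minimal sketch that reuses the synthetic train/test split from the first example; the exact scores will depend on the data and the number of trees.

from sklearn.metrics import log_loss
from xgboost import XGBClassifier

# Same number of trees, different step sizes
conservative = XGBClassifier(n_estimators=50, learning_rate=0.1, eval_metric='logloss')
aggressive = XGBClassifier(n_estimators=50, learning_rate=0.3, eval_metric='logloss')
conservative.fit(X_train, y_train)
aggressive.fit(X_train, y_train)

# Smaller steps usually need more boosting rounds to reach the same loss
print("learning_rate=0.1:", log_loss(y_test, conservative.predict_proba(X_test)))
print("learning_rate=0.3:", log_loss(y_test, aggressive.predict_proba(X_test)))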

The main difference between the two is the API in which they are used. eta is the parameter's native name in the XGBoost learning API, while learning_rate is the alias used by the scikit-learn API, conforming to scikit-learn naming conventions.

When working with XGBoost, the convention is to use eta with the native API and learning_rate with the scikit-learn API. Because both names map to the same underlying parameter, the choice ultimately comes down to the API in use and personal preference.
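
A practical benefit of the learning_rate name in the scikit-learn API is that it works directly with scikit-learn tooling such as GridSearchCV. The snippet below is a small sketch of that pattern; the grid values are illustrative only and assume the train/test split from the first example.

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Tune the step size with standard scikit-learn tooling
param_grid = {'learning_rate': [0.05, 0.1, 0.3]}
grid = GridSearchCV(XGBClassifier(n_estimators=100, eval_metric='logloss'),
                    param_grid, cv=3, scoring='neg_log_loss')
grid.fit(X_train, y_train)
print("Best learning_rate:", grid.best_params_['learning_rate'])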


