The eta parameter in XGBoost controls the learning rate, which determines the step size at each boosting iteration. Adjusting eta can significantly impact model performance and training time.
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
# Generate synthetic data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, random_state=42)
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create DMatrix objects for the train and test sets
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
# Define XGBoost training parameters
params = {
    'eta': 0.01,                     # learning rate
    'objective': 'binary:logistic',  # binary classification
    'eval_metric': 'logloss'         # evaluation metric
}
# Train the model
num_boost_round = 100 # number of boosting rounds
bst = xgb.train(params, dtrain, num_boost_round)
# Predict probabilities for the test set
predictions = bst.predict(dtest)
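Because the binary:logistic objective returns predicted probabilities rather than class labels, the output must be thresholded. A short follow-up reusing the variables above (the 0.5 threshold and scikit-learn's accuracy_score are standard choices, not part of the original example):
from sklearn.metrics import accuracy_score
# Convert predicted probabilities to hard class labels at the 0.5 threshold
predicted_labels = (predictions > 0.5).astype(int)
print(f"Test accuracy: {accuracy_score(y_test, predicted_labels):.3f}")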
When using the scikit-learn API, the eta parameter is called learning_rate.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
# Generate synthetic data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, random_state=42)
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the XGBoost classifier with a lower learning rate (the scikit-learn name for eta)
model = XGBClassifier(learning_rate=0.01, eval_metric='logloss')
# Fit the model
model.fit(X_train, y_train)
# Make predictions
predictions = model.predict(X_test)
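Unlike the native Booster, the scikit-learn wrapper's predict() returns class labels directly, with probabilities available via predict_proba(). A short evaluation sketch reusing the variables above:
from sklearn.metrics import accuracy_score
# predict() already returns class labels; predict_proba() exposes the probabilities
probabilities = model.predict_proba(X_test)[:, 1]  # probability of the positive class
print(f"First five positive-class probabilities: {probabilities[:5].round(3)}")
print(f"Test accuracy: {accuracy_score(y_test, predictions):.3f}")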
Understanding the “eta” Parameter
The eta parameter, also known as the learning rate in other gradient boosting frameworks, is a crucial setting that influences the speed and quality of learning in XGBoost. It determines the contribution of each tree to the final outcome by shrinking the tree's leaf weights before they are added to the ensemble.
- The valid range for eta is between 0 and 1, inclusive.
- The default value of eta in the XGBoost native API is 0.3.
- The default value for learning_rate in the random forest models XGBRFRegressor and XGBRFClassifier is 1.0.
Choosing the Right “eta” Value
The value of eta can significantly affect the model's performance:
- Higher eta values (closer to 1) lead to faster learning but may result in suboptimal solutions due to overshooting the optimal weights.
- Lower eta values (closer to 0) slow down the learning process but can lead to better generalization and reduced overfitting by allowing more precise weight adjustments (the sketch below compares the two regimes).
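To see this trade-off concretely, the sketch below trains two models on the same synthetic data with an identical budget of 50 boosting rounds and compares their final validation logloss. The specific eta values (0.5 and 0.05) and the round count are arbitrary choices for illustration, not recommendations.
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
# Same synthetic setup as the earlier examples
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
# Train with a fixed budget of rounds and compare validation logloss for two eta values
for eta in (0.5, 0.05):
    params = {'eta': eta, 'objective': 'binary:logistic', 'eval_metric': 'logloss'}
    evals_result = {}
    xgb.train(params, dtrain, num_boost_round=50,
              evals=[(dtest, 'test')], evals_result=evals_result,
              verbose_eval=False)
    print(f"eta={eta}: final test logloss = {evals_result['test']['logloss'][-1]:.4f}")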
When setting eta, consider the trade-off with the number of boosting rounds:
- With a lower eta, more boosting rounds are typically required to achieve optimal performance, increasing training time (the early-stopping sketch below shows a common way to pick the round count automatically).
- Higher eta values can converge faster but may require careful tuning of other parameters to prevent overfitting.
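In practice, rather than guessing the round count for a given eta, you can give the model a generous budget and let early stopping choose the effective number of rounds on a validation set. A minimal sketch with the native API, where the eta of 0.05 and the patience of 10 rounds are illustrative assumptions:
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
dtrain = xgb.DMatrix(X_train, label=y_train)
dval = xgb.DMatrix(X_val, label=y_val)
params = {'eta': 0.05, 'objective': 'binary:logistic', 'eval_metric': 'logloss'}
# Give the model a generous budget and stop once validation logloss stalls for 10 rounds
bst = xgb.train(params, dtrain, num_boost_round=1000,
                evals=[(dval, 'validation')],
                early_stopping_rounds=10,
                verbose_eval=False)
print(f"Best iteration: {bst.best_iteration}")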
Practical Tips
- Start with the default eta value and adjust as needed based on the model's performance.
- A common practice is to use a lower eta (e.g., 0.01 or 0.1) and increase the number of boosting rounds to achieve better results.
- Use cross-validation to find the optimal combination of eta and the number of boosting rounds for your specific dataset and problem (see the sketch after this list).
- Keep in mind that eta interacts with other regularization parameters, such as max_depth and min_child_weight. Tuning these parameters together can help achieve the best performance.
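For the cross-validation tip above, here is a minimal sketch using scikit-learn's GridSearchCV with the XGBClassifier wrapper; the grid values are arbitrary starting points, not tuned recommendations.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, random_state=42)
# Search a small, illustrative grid over the learning rate and the number of boosting rounds
param_grid = {
    'learning_rate': [0.01, 0.1, 0.3],
    'n_estimators': [100, 300]
}
grid = GridSearchCV(XGBClassifier(eval_metric='logloss'), param_grid,
                    scoring='neg_log_loss', cv=3)
grid.fit(X, y)
print(f"Best parameters: {grid.best_params_}")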