
Configure XGBoost L2 Regularization

L2 regularization, or Ridge, is a technique used to prevent overfitting in XGBoost models.

It adds a penalty term to the objective function proportional to the sum of the squared leaf weights (the tree-model analogue of coefficients in a linear model).

Configuring L2 regularization in XGBoost involves setting the lambda hyperparameter to a non-zero value (its default is 1.0). In the scikit-learn API, this parameter is named reg_lambda.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

# Create a synthetic dataset
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the XGBoost regressor with L2 regularization (lambda)
xgb_model = XGBRegressor(objective='reg:squarederror', reg_lambda=1.0, n_estimators=100)

# Train the model
xgb_model.fit(X_train, y_train)

# Predict on the test set
y_pred = xgb_model.predict(X_test)

L2 regularization works by adding a penalty to the objective function proportional to the square of each tree's leaf weights. Unlike L1 regularization, it encourages smaller but non-zero weights, which dampens the influence of any single leaf and helps prevent overfitting.

The strength of L2 regularization in XGBoost is controlled by the lambda hyperparameter. Higher values of lambda impose stronger regularization and shrink the leaf weights further. When configuring L2 regularization, it's reasonable to start with a small value of lambda (e.g., 0.1; XGBoost's default is 1.0) and tune it based on the model's performance on a validation set.

XGBoost also supports L1 regularization (Lasso), controlled by the alpha hyperparameter (reg_alpha in the scikit-learn API). In practice, it's common to combine L1 and L2 regularization to balance feature selection against weight shrinkage.

L2 regularization is particularly useful when dealing with high-dimensional datasets or when shrinkage is desired to improve model generalization. By encouraging smaller leaf weights, L2 regularization reduces the model's sensitivity to individual features and promotes a more stable and robust model.
