The skip_drop parameter is an important setting when using the XGBoost Dart booster, which can be specified by setting booster='dart'.
This parameter controls the probability of skipping the dropout procedure during a boosting iteration. When a dropout is skipped, new trees are added in the same manner as with the gbtree booster. Notably, a non-zero skip_drop value takes precedence over the rate_drop and one_drop parameters.
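For orientation, here is a minimal sketch of where skip_drop sits among the Dart-specific parameters when using the native xgboost.train API rather than the scikit-learn wrapper used later in this post (the parameter values are illustrative, not recommendations):
import xgboost as xgb
from sklearn.datasets import make_classification

# Small synthetic dataset for illustration
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
dtrain = xgb.DMatrix(X, label=y)

# Dart-specific parameters; on iterations where the dropout is skipped
# (probability skip_drop), rate_drop and one_drop have no effect
params = {
    'booster': 'dart',
    'objective': 'binary:logistic',
    'rate_drop': 0.2,  # fraction of trees dropped when dropout runs
    'one_drop': 1,     # drop at least one tree when dropout runs
    'skip_drop': 0.3,  # 30% chance per iteration of skipping dropout
}
bst = xgb.train(params, dtrain, num_boost_round=50)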
The skip_drop parameter takes values between 0.0 and 1.0, where 0.0 means the dropout procedure is never skipped (the default behavior) and 1.0 means the dropout procedure is always skipped, effectively disabling the Dart booster’s dropout mechanism. Typical values for skip_drop range from 0.0 to 0.5.
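As a quick sanity check on the 1.0 endpoint, the sketch below (with arbitrary, illustrative hyperparameters) trains a Dart model that always skips dropout alongside a plain gbtree model; since every iteration then adds trees gbtree-style, the two should perform essentially identically:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# skip_drop=1.0: dropout is always skipped, so rate_drop is never applied
dart_no_dropout = XGBClassifier(booster='dart', skip_drop=1.0, rate_drop=0.2,
                                n_estimators=50, random_state=42)
plain_gbtree = XGBClassifier(booster='gbtree', n_estimators=50, random_state=42)

dart_no_dropout.fit(X_train, y_train)
plain_gbtree.fit(X_train, y_train)

# The two accuracies should be essentially the same
print(accuracy_score(y_test, dart_no_dropout.predict(X_test)))
print(accuracy_score(y_test, plain_gbtree.predict(X_test)))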
Here’s an example demonstrating how to set the skip_drop parameter and its effect on model performance:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score
# Generate a synthetic classification dataset
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=42)
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize an XGBClassifier with dart booster and skip_drop=0.0 (default)
clf_no_skip = XGBClassifier(booster='dart', max_depth=5, learning_rate=0.1, n_estimators=100,
                            rate_drop=0.2, one_drop=True, skip_drop=0.0, random_state=42)
# Initialize an XGBClassifier with dart booster and skip_drop=0.5
clf_with_skip = XGBClassifier(booster='dart', max_depth=5, learning_rate=0.1, n_estimators=100,
                              rate_drop=0.2, one_drop=True, skip_drop=0.5, random_state=42)
# Train the models
clf_no_skip.fit(X_train, y_train)
clf_with_skip.fit(X_train, y_train)
# Make predictions on the test set
pred_no_skip = clf_no_skip.predict(X_test)
pred_with_skip = clf_with_skip.predict(X_test)
# Evaluate the models
accuracy_no_skip = accuracy_score(y_test, pred_no_skip)
accuracy_with_skip = accuracy_score(y_test, pred_with_skip)
print(f"Accuracy (skip_drop=0.0): {accuracy_no_skip:.4f}")
print(f"Accuracy (skip_drop=0.5): {accuracy_with_skip:.4f}")
In this example, we generate a synthetic binary classification dataset and split it into training and testing sets. We then initialize two XGBClassifier instances with the Dart booster, one with skip_drop=0.0 (the default, so dropout is never skipped) and another with skip_drop=0.5 (a 50% chance of skipping dropout on each iteration). Both models use non-zero values for rate_drop and one_drop to showcase the priority of skip_drop.
After training both models, we make predictions on the test set and evaluate their accuracies. The output will show the difference in performance between the model without skipping dropout and the model with a 50% chance of skipping dropout.
The results demonstrate how the skip_drop parameter influences the model’s behavior and performance. When skip_drop is set to a non-zero value, there is a chance on each iteration that the dropout procedure is bypassed entirely; on those iterations, rate_drop and one_drop have no effect, and trees are added as in the gbtree booster. This can help strike a balance between the regularization benefits of dropout and the potential underfitting caused by excessive dropout.
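To see this influence more concretely, we can extend the example above with a quick sweep over several skip_drop values (a sketch; the exact accuracies will vary with the dataset, the other hyperparameters, and the random seed):
# Continuing from the example above (reuses X_train, X_test, y_train, y_test)
for skip in [0.0, 0.25, 0.5, 0.75, 1.0]:
    clf = XGBClassifier(booster='dart', max_depth=5, learning_rate=0.1,
                        n_estimators=100, rate_drop=0.2, one_drop=True,
                        skip_drop=skip, random_state=42)
    clf.fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"skip_drop={skip:.2f} -> accuracy={acc:.4f}")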
When using the XGBoost Dart booster, it’s recommended to experiment with different values of skip_drop in conjunction with rate_drop and one_drop to find the optimal configuration for your specific dataset and problem. Keep in mind that setting skip_drop too high may diminish the regularization effects of the Dart booster, while setting it too low may not provide sufficient control over the dropout behavior.
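One practical way to run that experiment is a small grid search over skip_drop and rate_drop together. The sketch below uses scikit-learn’s GridSearchCV with an illustrative grid, not a recommended one:
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# Illustrative grid; tailor the ranges to your own dataset
param_grid = {
    'skip_drop': [0.0, 0.1, 0.25, 0.5],
    'rate_drop': [0.1, 0.2, 0.4],
}
grid = GridSearchCV(
    estimator=XGBClassifier(booster='dart', one_drop=True,
                            n_estimators=100, random_state=42),
    param_grid=param_grid, cv=3, scoring='accuracy', n_jobs=-1,
)
grid.fit(X, y)
print(grid.best_params_)
print(f"Best CV accuracy: {grid.best_score_:.4f}")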