XGBoost offers native support for multiple output regression (multi-output regression) through the `tree_method="hist"` and `multi_strategy="multi_output_tree"` parameters, available in XGBoost 2.0 and later. With these parameters, you can efficiently train an XGBoost model to predict multiple continuous target variables simultaneously, without relying on external wrappers like scikit-learn's `MultiOutputRegressor`.
This example demonstrates how to train an XGBoost model for multiple output regression using the native support provided by XGBoost. We’ll generate a synthetic dataset, prepare the data, initialize the model with the appropriate parameters, train it, and evaluate its performance.
# XGBoosting.com
# Train an XGBoost Model for Multiple Output Regression with Native Support
from xgboost import XGBRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Generate a synthetic multi-output regression dataset
X, y = make_regression(n_samples=1000,
                       n_features=10,
                       n_targets=3,
                       noise=0.1,
                       random_state=42,
                       n_informative=5)
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize an XGBRegressor model with native multi-output support
model = XGBRegressor(n_estimators=100,
                     learning_rate=0.1,
                     tree_method="hist",
                     multi_strategy="multi_output_tree",
                     random_state=42)
# Fit the XGBRegressor on the training data
model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = model.predict(X_test)
# Evaluate the model's performance using mean squared error
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")
Here's how it works:
- Generate a synthetic multi-output regression dataset with 10 input features and 3 output targets.
- Split the data into training and testing sets using `train_test_split`.
- Initialize an `XGBRegressor` model with `tree_method="hist"` and `multi_strategy="multi_output_tree"` for native multi-output support.
- Fit the `XGBRegressor` on the training data using `fit()`.
- Make predictions on the test set using `predict()`.
- Evaluate the model's performance using Mean Squared Error (MSE); a per-target breakdown is sketched after this list.
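The single MSE value above averages the error across all three targets. To see how the model does on each output separately, scikit-learn's `mean_squared_error` accepts `multioutput="raw_values"`. A minimal sketch, assuming `y_test` and `y_pred` from the example above are in scope:

```python
from sklearn.metrics import mean_squared_error

# multioutput="raw_values" returns one MSE per output column
# instead of averaging across all targets
per_target_mse = mean_squared_error(y_test, y_pred, multioutput="raw_values")
for i, err in enumerate(per_target_mse):
    print(f"Target {i}: MSE = {err:.4f}")
```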
By setting `tree_method="hist"` and `multi_strategy="multi_output_tree"`, XGBoost handles the multiple output regression task internally, without the need for external wrappers. This approach can provide improved performance and simplify the code required for training and prediction.
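For comparison, here is roughly what the wrapper-based alternative looks like. This is a sketch reusing `X_train`, `y_train`, and `X_test` from the example above; `MultiOutputRegressor` fits one independent XGBoost model per target, rather than a single model with multi-output trees:

```python
from sklearn.multioutput import MultiOutputRegressor
from xgboost import XGBRegressor

# Wrapper approach: one separate XGBoost model is fit per target column
wrapped = MultiOutputRegressor(
    XGBRegressor(n_estimators=100, learning_rate=0.1, random_state=42)
)
wrapped.fit(X_train, y_train)
y_pred_wrapped = wrapped.predict(X_test)
```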
This example serves as a starting point for training XGBoost models for multi-output regression using native support. Depending on your specific dataset and requirements, you may need to preprocess the data, tune hyperparameters, or use different evaluation metrics to achieve optimal results.
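For instance, hyperparameter tuning works the same way as for single-output regression, because the native multi-output model follows the scikit-learn estimator interface. A minimal sketch using `GridSearchCV`, assuming `X_train` and `y_train` from the example above are in scope; the grid values are illustrative, not tuned recommendations:

```python
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

# Illustrative grid; these values are placeholders, not recommendations
param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [3, 6],
    "learning_rate": [0.05, 0.1],
}

search = GridSearchCV(
    XGBRegressor(tree_method="hist",
                 multi_strategy="multi_output_tree",
                 random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",  # MSE averaged across all targets
    cv=3,
)
search.fit(X_train, y_train)
print(search.best_params_)
```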