This example demonstrates how to use XGBoost’s support for multiple output regression via multi_strategy='multi_output_tree' to forecast multiple future time steps of a univariate time series.
We’ll cover data preparation, model training, and making multi-step predictions using a synthetic dataset, highlighting the benefits and use cases of multi-step forecasting.
# XGBoosting.com
# Multi-Step Univariate Time Series Forecasting with XGBoost's multi_output_tree Strategy
import numpy as np
import pandas as pd
from xgboost import XGBRegressor
from sklearn.metrics import mean_squared_error
# Generate a synthetic univariate time series dataset (seeded for reproducibility)
np.random.seed(42)
series = np.sin(0.1 * np.arange(200)) + np.random.randn(200) * 0.1
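# Prepare data for supervised learning: lagged inputs and a multi-step target
df = pd.DataFrame({'series': series})
n_lags = 3
n_steps = 3
for i in range(1, n_lags + 1):
    df[f'lag_{i}'] = df['series'].shift(i)
# One target column per horizon: the n_steps values that follow the newest lag
for h in range(n_steps):
    df[f'target_{h + 1}'] = df['series'].shift(-h)
df = df.dropna()
X = df[[f'lag_{i}' for i in range(1, n_lags + 1)]].values
y = df[[f'target_{h + 1}' for h in range(n_steps)]].values  # shape (n_samples, n_steps)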
# Chronological split of data into train and test sets
split_index = int(len(X) * 0.8)
X_train, X_test = X[:split_index], X[split_index:]
y_train, y_test = y[:split_index], y[split_index:]
# Initialize an XGBRegressor model with multi_output_tree strategy
# (vector-leaf trees require the 'hist' tree method)
model = XGBRegressor(n_estimators=100, learning_rate=0.1, tree_method='hist',
                     random_state=42, multi_strategy='multi_output_tree')
# Fit the model on the training data
model.fit(X_train, y_train)
# Make multi-step predictions on the test set
y_pred = model.predict(X_test)
# Evaluate the model's performance
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")
This example showcases how to use XGBoost’s 'multi_output_tree' strategy for multi-step univariate time series forecasting. Here’s a step-by-step breakdown:
- Generate a synthetic univariate time series using a sine wave with added noise.
- Prepare the data for supervised learning by creating lagged features and a multi-step target. Here, we use 3 lags and predict 3 steps ahead.
- Split the data chronologically into train and test sets to maintain the temporal order.
- Initialize an XGBRegressor model with multi_strategy set to 'multi_output_tree'.
- Fit the model on the training data using fit().
- Make multi-step predictions on the test set using predict().
- Evaluate the model’s performance using Mean Squared Error (MSE).
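The evaluation step reports a single MSE, which for a multi-column target is averaged over all forecast horizons. As a rough sketch of how the error could be broken down per horizon, scikit-learn's mean_squared_error accepts multioutput='raw_values'. The small arrays below are made-up stand-ins for y_test and y_pred, not results from the model.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical stand-ins: 2 samples, 3 forecast horizons (columns)
y_true_demo = np.array([[1.0, 2.0, 3.0],
                        [2.0, 3.0, 4.0]])
y_pred_demo = np.array([[1.0, 2.0, 2.0],
                        [2.0, 4.0, 4.0]])

# One MSE per column, i.e. per forecast horizon
per_horizon = mean_squared_error(y_true_demo, y_pred_demo, multioutput='raw_values')
# Default behaviour: uniform average over the columns
overall = mean_squared_error(y_true_demo, y_pred_demo)

for h, err in enumerate(per_horizon, start=1):
    print(f"Horizon {h}: MSE = {err:.4f}")
print(f"Overall: MSE = {overall:.4f}")
```

Errors usually grow with the horizon, so a per-horizon view can show how far ahead the forecasts remain useful.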
Multi-step forecasting is useful when you need to predict multiple future time steps at once. This is particularly relevant for applications like inventory management, resource planning, and financial forecasting, where knowing the expected values for several periods ahead can help make informed decisions.
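The multi_strategy parameter controls how XGBoost handles multiple targets. With the default 'one_output_per_tree', a separate set of trees is built for each target; with 'multi_output_tree', each tree has vector-valued leaves, so a single tree outputs all future time steps at once. Because the splits are chosen with all horizons in view, this approach can capture dependencies between the input features and the multi-step target, and it avoids the error accumulation that can occur when single-step predictions are fed back in iteratively. Whether it improves accuracy over the iterative approach depends on the data.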
By modifying the data preparation and the number of time steps to forecast, you can adapt this example to various multi-step univariate time series forecasting tasks. Experiment with different hyperparameters and feature engineering techniques to further optimize the model’s performance for your specific use case.
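One way to make the data preparation easy to adapt is to wrap it in a function that takes the number of lags and forecast steps as arguments. The helper below, make_supervised, is a hypothetical name introduced here for illustration; it assumes NumPy 1.20 or later for sliding_window_view and orders each input window from oldest to newest value.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_supervised(series, n_lags, n_steps):
    """Split a 1-D series into lagged inputs and multi-step targets.

    Returns X with shape (n_samples, n_lags) and y with shape
    (n_samples, n_steps), where each y row directly follows its X row.
    """
    values = np.asarray(series, dtype=float)
    # Every window holds n_lags inputs followed by n_steps targets
    windows = sliding_window_view(values, n_lags + n_steps)
    X = windows[:, :n_lags].copy()
    y = windows[:, n_lags:].copy()
    return X, y


# Toy series 0..9 so the windows are easy to check by eye
X_demo, y_demo = make_supervised(np.arange(10), n_lags=3, n_steps=2)
print(X_demo.shape, y_demo.shape)  # (6, 3) (6, 2)
print(X_demo[0], y_demo[0])        # [0. 1. 2.] [3. 4.]
```

The resulting arrays can be passed to XGBRegressor.fit() in place of X and y, and changing n_lags or n_steps requires no other edits to the preparation code.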