
XGBoost for Multi-Step Univariate Time Series Forecasting Manually

This example demonstrates how to train a separate XGBoost model for each forecast time step and combine their predictions into a multi-step univariate time series forecast.

We'll use a synthetic dataset, prepare the data with a sliding window approach, train one XGBRegressor per forecast step, and stitch the models' predictions together to obtain the final multi-step forecast.

# XGBoosting.com
# Multi-Step Univariate Time Series Forecasting (Manually)
import numpy as np
from xgboost import XGBRegressor
from sklearn.metrics import mean_absolute_error

# Generate a synthetic univariate time series dataset
np.random.seed(42)
time = np.arange(1000)
series = np.sin(2 * np.pi * time / 100) + np.cos(2 * np.pi * time / 200) + np.random.normal(0, 0.1, 1000)

# Define the number of time steps to forecast
forecast_steps = 5

def sliding_window(series, n_lags, n_ahead):
    """Build (X, y) pairs: each row of X is a window of n_lags consecutive
    values, and y is the value n_ahead steps after that window ends."""
    X, y = [], []
    for i in range(len(series) - n_lags - n_ahead + 1):
        X.append(series[i:(i + n_lags)])
        y.append(series[i + n_lags + n_ahead - 1])
    return np.array(X), np.array(y)

# Prepare data for each forecast time step
n_lags = 10
X_train, y_train, X_test, y_test = [], [], [], []
for step in range(1, forecast_steps + 1):
    X, y = sliding_window(series, n_lags, step)
    split_index = int(len(X) * 0.8)
    X_train.append(X[:split_index])
    y_train.append(y[:split_index])
    X_test.append(X[split_index:])
    y_test.append(y[split_index:])

# Train an XGBRegressor for each forecast time step
models = []
for step in range(forecast_steps):
    model = XGBRegressor(n_estimators=100, learning_rate=0.1, random_state=42)
    model.fit(X_train[step], y_train[step])
    models.append(model)

# Generate the multi-step forecast: each model predicts its own horizon
# directly from the same fixed input window, which must end just before
# the forecast period so it never overlaps the values being predicted
y_pred = []
last_window = series[-(n_lags + forecast_steps):-forecast_steps]
for model in models:
    pred = model.predict(last_window.reshape(1, -1))
    y_pred.append(pred[0])

# Evaluate the multi-step forecast against the held-back final values
y_true = series[-forecast_steps:]
mae = mean_absolute_error(y_true, y_pred)
print(f"Mean Absolute Error: {mae:.4f}")

In this example, we:

  1. Generate a synthetic univariate time series dataset.
  2. Define the number of time steps to forecast (forecast_steps).
  3. Create a sliding_window function that prepares the data for supervised learning by building input windows (X) and horizon-specific target values (y); a toy walkthrough of its output appears after this list.
  4. Prepare the data for each forecast time step by calling sliding_window with different target offsets.
  5. Split each prepared dataset into train and test sets.
  6. Initialize an XGBRegressor model for each forecast time step and fit it on the respective training data.
  7. Generate the multi-step forecast by having each model predict its own horizon directly from the same fixed input window (the last complete window before the forecast period).
  8. Evaluate the performance of the multi-step forecast using Mean Absolute Error (MAE).
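
To make the window/target offsets in sliding_window concrete, here is a small standalone sanity check on a toy array (it repeats the function definition from the listing above so it runs on its own; the toy values are arbitrary):

import numpy as np

def sliding_window(series, n_lags, n_ahead):
    X, y = [], []
    for i in range(len(series) - n_lags - n_ahead + 1):
        X.append(series[i:(i + n_lags)])
        y.append(series[i + n_lags + n_ahead - 1])
    return np.array(X), np.array(y)

toy = np.arange(10)  # [0, 1, ..., 9]

# 1 step ahead: window [0, 1, 2] predicts the very next value
X1, y1 = sliding_window(toy, n_lags=3, n_ahead=1)
print(X1[0], y1[0])  # [0 1 2] 3

# 2 steps ahead: the same window now skips one value
X2, y2 = sliding_window(toy, n_lags=3, n_ahead=2)
print(X2[0], y2[0])  # [0 1 2] 4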

This approach, often called the direct multi-step strategy, gives you the flexibility of a separate model per forecast horizon, which can help when the relationship between the inputs and the target changes with the horizon. The trade-off is cost: you train and store forecast_steps models instead of one. A sketch of the main alternative, the recursive (iterated) strategy, follows below.
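
For contrast, here is a minimal sketch of the recursive strategy, assuming the series, sliding_window, n_lags, and forecast_steps definitions from the listing above. A single 1-step-ahead model is reused for every horizon by feeding each prediction back into the input window; the hyperparameters simply mirror the example and are not tuned.

# Recursive alternative: one 1-step-ahead model reused for all horizons
X, y = sliding_window(series, n_lags, 1)
split_index = int(len(X) * 0.8)
model = XGBRegressor(n_estimators=100, learning_rate=0.1, random_state=42)
model.fit(X[:split_index], y[:split_index])

# Start from the last window before the forecast period and roll forward
window = series[-(n_lags + forecast_steps):-forecast_steps]
recursive_pred = []
for _ in range(forecast_steps):
    next_value = model.predict(window.reshape(1, -1))[0]
    recursive_pred.append(next_value)
    window = np.append(window[1:], next_value)  # feed the prediction back in

y_true = series[-forecast_steps:]
print(f"Recursive MAE: {mean_absolute_error(y_true, recursive_pred):.4f}")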


