XGBoost can be used to fit Poisson regression models for predicting count data.
Poisson regression is a generalized linear model that’s useful when the target variable represents counts, such as the number of events occurring in a fixed interval of time.
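To make the idea concrete, here is a minimal sketch of the loss that Poisson regression minimizes, assuming the usual log link (the predicted mean is the exponential of a raw score). The function name `poisson_nll` and the toy array `y_demo` are illustrative and not part of XGBoost; the snippet only shows the shape of the objective, not how XGBoost implements it internally.

```python
from math import lgamma

import numpy as np


def poisson_nll(y, raw_score):
    """Mean Poisson negative log-likelihood for a raw score on the log scale."""
    y = np.asarray(y, dtype=float)
    mu = np.exp(raw_score)  # log link: predicted mean is exp(raw score)
    log_factorial = np.array([lgamma(v + 1.0) for v in y])  # log(y!)
    return float(np.mean(mu - y * raw_score + log_factorial))


# Toy counts (illustrative values only)
y_demo = np.array([0, 1, 3, 2])

# For a constant prediction, the loss is minimized at log(mean(y))
best_score = np.log(y_demo.mean())

print(poisson_nll(y_demo, best_score))
print(poisson_nll(y_demo, best_score + 0.5))  # a worse score gives a larger loss
```

Because the mean is modeled on the log scale, predictions are always positive, which is what makes this loss a natural fit for count targets.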
Here’s a quick example of how to train an XGBoost model for Poisson regression using the scikit-learn API.
# XGBoosting.com
# Fit an XGBoost Model for Poisson Regression using scikit-learn API
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor
import numpy as np
# Generate a synthetic dataset suitable for Poisson regression
X, y = make_regression(n_samples=1000, n_features=5, noise=10, random_state=42)
# Use a seeded generator so the Poisson draws are reproducible
rng = np.random.default_rng(42)
y = rng.poisson(np.exp(y / 25))
# Initialize XGBRegressor
model = XGBRegressor(objective='count:poisson', random_state=42)
# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Fit the model on the training data
model.fit(X_train, y_train)
# Make predictions on the test set
predictions = model.predict(X_test)
print(predictions[:5])
The key steps:
- Initialize an `XGBRegressor` with the appropriate `objective` (here, `'count:poisson'` for Poisson regression).
- Split your data into training and testing sets.
- Fit the model on the training data using `fit()`.
- Make predictions on the test set using `predict()`.