Train an XGBoost Model on a NumPy Array

When your data is already stored in NumPy arrays, you can use them directly to train an XGBoost model.

Simply pass your feature matrix X and target vector y to the fit() method of your XGBoost model.

from xgboost import XGBClassifier
import numpy as np

# Assuming X and y are NumPy arrays
X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
y = np.array([0, 1, 1])

# Initialize and train the model
model = XGBClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
model.fit(X, y)

Here’s what’s happening:

1. We assume that our feature matrix X and target vector y are already stored as NumPy arrays. X should be two-dimensional with shape (n_samples, n_features), and y should have one entry per row of X.
2. We create an instance of XGBClassifier (or XGBRegressor for regression tasks) and specify our desired hyperparameters.
3. We pass X and y directly to the fit() method. XGBoost accepts these NumPy arrays as they are, so no manual conversion is needed on your part.

NumPy arrays are a natural fit for XGBoost: the scikit-learn-style estimators accept them as input, so you can skip converting your data to a different format yourself before training your model.

See Also