The XGBoost native API provides a straightforward and effective way to handle model saving and loading, offering a robust alternative to more general serialization methods like Python’s pickle module.
This tip demonstrates how to save and load XGBoost models using the native save_model and load_model methods. Because these methods are designed specifically for XGBoost, the model's full configuration and learned parameters are preserved when it is reloaded.
import xgboost as xgb
from sklearn.datasets import load_iris
# Load data
X, y = load_iris(return_X_y=True)
dtrain = xgb.DMatrix(X, label=y)
# Train the model
params = {'max_depth':3, 'eta':1, 'objective':'multi:softprob', 'num_class':3}
bst = xgb.train(params, dtrain, 10)
# Save the model to a json file
bst.save_model('0001.json')
# Load the model from the file
loaded_bst = xgb.Booster()
loaded_bst.load_model('0001.json')
# Prepare new data for prediction
dpredict = xgb.DMatrix(X)
# Predict using the loaded model
predictions = loaded_bst.predict(dpredict)
- We start by training an XGBoost model on the iris dataset, which is a simple multiclass classification problem.
- We use bst.save_model('0001.json') to save the trained model. This method writes the model in XGBoost's JSON format, ensuring that all configuration and learned parameters are preserved.
- The model is reloaded using loaded_bst.load_model('0001.json'). This ensures that the model we use for predictions is exactly the same as the one we saved.
- Finally, we demonstrate that the model, once loaded, functions as expected by making predictions on the same dataset; a quick equivalence check is sketched below.
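As a minimal sanity check (not part of the original snippet), the sketch below continues from the variables defined above and verifies that the in-memory booster and the reloaded one produce the same predictions:
import numpy as np
# Predictions from the original, in-memory booster
original_preds = bst.predict(dpredict)
# Predictions from the booster reloaded from '0001.json'
reloaded_preds = loaded_bst.predict(dpredict)
# The two prediction arrays should agree up to floating-point tolerance
assert np.allclose(original_preds, reloaded_preds)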
Using XGBoost’s native save_model and load_model ensures that the model’s efficiency and accuracy are retained, making this approach preferable for production environments where consistency is critical.
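For production deployments, the same native API can also write a more compact binary flavour of this format. The sketch below is an assumption-labelled example: it assumes XGBoost 1.6 or newer, reuses the bst booster from the example above, and uses '0001.ubj' purely as an illustrative filename.
# The file extension selects the format: '.ubj' writes binary UBJSON
# (assumes XGBoost >= 1.6; 'bst' is the booster trained above)
bst.save_model('0001.ubj')
# Reloading works exactly as with the JSON file
loaded_ubj = xgb.Booster()
loaded_ubj.load_model('0001.ubj')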