Save XGBoost Model To JSON with the Native API


The XGBoost native API can save and load models directly, without going through a general-purpose serializer such as Python’s pickle module.

This tip demonstrates how to save and load XGBoost models using the native save_model and load_model methods. These methods write the model in XGBoost’s own format, so the trained trees and the parameters needed for prediction are restored when the file is loaded.

import xgboost as xgb
from sklearn.datasets import load_iris

# Load data
X, y = load_iris(return_X_y=True)
dtrain = xgb.DMatrix(X, label=y)

# Train the model
params = {'max_depth':3, 'eta':1, 'objective':'multi:softprob', 'num_class':3}
bst = xgb.train(params, dtrain, 10)

# Save the model to a json file
bst.save_model('0001.json')

# Load the model from the file
loaded_bst = xgb.Booster()
loaded_bst.load_model('0001.json')

# Prepare new data for prediction
dpredict = xgb.DMatrix(X)
# Predict using the loaded model
predictions = loaded_bst.predict(dpredict)

We start by training an XGBoost model on the iris dataset, which is a simple multiclass classification problem.
We use bst.save_model('0001.json') to save the trained model. The .json file extension tells XGBoost to write the model in its JSON format, which stores the trained trees along with the model parameters needed for prediction.
The model is reloaded by creating an empty xgb.Booster() and calling loaded_bst.load_model('0001.json'), so the model used for predictions is the one that was saved.
Finally, we demonstrate that the model, once loaded, functions as expected by making predictions on the same dataset.

Using XGBoost’s native save_model and load_model keeps the saved model independent of Python’s pickle format and reproduces the original model’s predictions after reloading, which makes this approach a good fit for production environments where consistency is critical.

See Also