
XGBoost Save Model with save_model()

The XGBoost save_model() method saves a trained model to a file so it can be reloaded later without retraining.

You can save models in either a text format (JSON) or a binary format (Universal Binary JSON, abbreviated UBJSON and stored with the .ubj extension).

Here’s an example of how to save and load XGBoost models in both formats:

from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)

# Train an XGBoost classifier
model = XGBClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
model.fit(X, y)

# Save the model in binary JSON format
model.save_model('model.ubj')

# Save the model in JSON format
model.save_model('model.json')

# Load the binary model
binary_model = XGBClassifier()
binary_model.load_model('model.ubj')

# Load the JSON model
json_model = XGBClassifier()
json_model.load_model('model.json')

# Make predictions with both models
binary_predictions = binary_model.predict(X)
json_predictions = json_model.predict(X)

In this example, we first generate a synthetic dataset using scikit-learn’s make_classification function. We then train an XGBoost classifier on this dataset.

Next, we save the trained model in both UBJSON and JSON formats. save_model() picks the format from the file extension: a filename ending in ‘.ubj’ produces a binary UBJSON file, and a filename ending in ‘.json’ produces a plain-text JSON file.

To load the models, we create new instances of the XGBClassifier class and call load_model() with the appropriate filename. Finally, we make predictions with both loaded models. If saving and loading worked correctly, these predictions should match those of the original model.

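Note that both formats store the same model information, and the XGBoost documentation describes both as portable across its language bindings and backward compatible across versions. UBJSON files are typically more compact and faster to load, while JSON is human-readable and easier to inspect with other tools. The cross-version compatibility concerns apply to XGBoost's older, deprecated binary format rather than to UBJSON.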


