XGBoost’s dump_model() function allows you to export a trained model in a human-readable format, which can be useful for model interpretation and visualization.
Unlike save_model(), the output from dump_model() is not designed to be loaded back into XGBoost, but rather to provide insight into the model’s structure and learned splits.
Here’s an example of how to use dump_model() to export a trained XGBoost model in text format:
from sklearn.datasets import make_classification
from xgboost import XGBClassifier
# Generate a small synthetic dataset
X, y = make_classification(n_samples=100, n_classes=2, n_features=5, random_state=42)
# Train an XGBoost classifier
model = XGBClassifier(n_estimators=10, max_depth=2, random_state=42)
model.fit(X, y)
# Dump the model in text format
model.get_booster().dump_model('model.txt', fmap='', with_stats=True, dump_format='text')
print("Text Model:")
with open('model.txt', 'r') as file:
    for line in file:
        # Each line already ends with a newline, so suppress print's own
        print(line, end='')
In this example, we first generate a small synthetic dataset using scikit-learn’s make_classification function. We then train a simple XGBoost classifier with 10 trees and a maximum depth of 2.
To dump the trained model in text format, we call dump_model() on the model’s booster (obtained with get_booster()) with dump_format='text'. The with_stats=True argument adds the gain and cover statistics to each node in the output.
We then read the dumped file back and print its contents line by line.
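The text output lists the model’s trees one after another. Each tree starts with a booster[i]: header, and each following line describes one node. A split node shows the feature and threshold (for example, a condition written like [f2<0.5]), the IDs of the child nodes for the yes, no, and missing branches, and, because with_stats=True, the gain and cover. A leaf node shows its leaf value and cover. This format can be useful for understanding how the model makes decisions.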
It’s important to note that models dumped with dump_model() cannot be loaded back into XGBoost. If you need to save a model for later use, consider using save_model() instead, which saves the model in a format that can be loaded back and used for making predictions.