While Python’s pickle module is a popular choice for saving and loading models, the joblib package offers a more efficient alternative, especially for large numpy arrays often used in XGBoost.
from joblib import dump, load
from sklearn.datasets import make_classification
from xgboost import XGBClassifier
# Generate a random classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
# Train an XGBoost model
model = XGBClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
model.fit(X, y)
# Save model to file using joblib
dump(model, 'xgb_model.joblib')
# Load model from file
loaded_model = load('xgb_model.joblib')
# Use the loaded model for predictions
predictions = loaded_model.predict(X)
Here’s the breakdown:
- We train an XGBoost classifier on the dataset.
- We save the trained model to a file named ‘xgb_model.joblib’ using joblib.dump().
- Later, we load the model from the file using joblib.load().
- We then use the loaded model to make predictions, just as we would with the original model.
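One reason joblib handles models like this well is that it serializes large numpy arrays efficiently and can compress them on disk. The sketch below illustrates the round-trip guarantee and the compress option using a plain numpy array as a stand-in for a model’s internal state; the filenames are illustrative, and the same compress argument works in the dump() call shown above.

```python
import os
import numpy as np
from joblib import dump, load

# A large numpy array stands in for the array-heavy state inside a model.
data = np.zeros((1000, 100))

# Save once without compression, once with compression level 3 (range is 0-9;
# higher levels shrink the file further at the cost of save/load time).
dump(data, 'uncompressed.joblib')
dump(data, 'compressed.joblib', compress=3)

size_plain = os.path.getsize('uncompressed.joblib')
size_small = os.path.getsize('compressed.joblib')

# Loading restores an exact copy, just as with the model above.
restored = load('compressed.joblib')
print(np.array_equal(data, restored))   # the round-trip is lossless
print(size_small < size_plain)          # compression reduced the file size
```

For a real XGBoost model, the same idea applies: dump(model, 'xgb_model.joblib', compress=3) produces a smaller file, and the loaded model’s predictions match the original’s.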