The xgboost.XGBRanker class provides a way to train XGBoost models for learning to rank tasks within the scikit-learn framework.
However, learning to rank is not fully supported in scikit-learn, so the XGBRanker class can be tricky to use.
In this example, we’ll fit an XGBRanker model using the scikit-learn API.
import numpy as np
from xgboost import XGBRanker
# Generate synthetic data
np.random.seed(42)
n_samples = 100
n_features = 10
# Generate feature data
X = np.random.rand(n_samples, n_features)
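# Generate 15 positive group sizes that sum to n_samples by cutting
# the range [0, n_samples] at 14 distinct random points
cut_points = np.sort(np.random.choice(np.arange(1, n_samples), size=14, replace=False))
group_sizes = np.diff(np.concatenate(([0], cut_points, [n_samples])))
# Ensure every group is non-empty and the sizes sum to n_samples
assert group_sizes.min() > 0
assert group_sizes.sum() == n_samples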
# Generate labels (targets) for ranking
y = np.random.rand(n_samples)
# Create an instance of the XGBRanker
model = XGBRanker(
    objective='rank:pairwise',
    learning_rate=0.1,
    max_depth=6,
    n_estimators=100
)
# Fit the model
model.fit(X, y, group=group_sizes.tolist())
# Make predictions
predictions = model.predict(X)
# Print the first 10 predictions
print(predictions[:10])
In this example, we first generate a synthetic learning to rank dataset using NumPy random numbers.
Next, we define an XGBRanker model with the 'rank:pairwise' objective and other standard parameters.
We then fit the model, passing the size of each query group as a list to the group parameter. The sizes must be given in row order and sum to the number of rows in X.
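To make the grouping concrete, here is a small sketch of how a list of group sizes maps onto the rows of the dataset. The sizes below are made up for illustration. Recent XGBoost releases also accept a per-row qid array in fit as an alternative to group; check the documentation for your installed version before relying on it.

```python
import numpy as np

# Hypothetical group sizes: three queries with 3, 2, and 4 rows each
group_sizes = [3, 2, 4]

# Rows are assumed to be ordered so that each query's rows are contiguous.
# Expand the sizes into one query id per row.
qid = np.repeat(np.arange(len(group_sizes)), group_sizes)

# Going the other way: recover the group sizes from sorted query ids
_, recovered_sizes = np.unique(qid, return_counts=True)

print(qid.tolist())              # [0, 0, 0, 1, 1, 2, 2, 2, 2]
print(recovered_sizes.tolist())  # [3, 2, 4]
```

Either representation carries the same information: the first three rows belong to the first query, the next two to the second, and so on.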
This gives a minimal working example of using the xgboost.XGBRanker class for learning to rank within scikit-learn.
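The scores returned by predict are only comparable within a query group, so a common next step is to turn them into a per-group ordering. The sketch below uses made-up scores in place of model.predict(X) and hypothetical group sizes; it is one possible way to do this with NumPy, not part of the XGBoost API.

```python
import numpy as np

# Hypothetical scores standing in for model.predict(X), with matching group sizes
scores = np.array([0.2, 0.9, 0.5, 0.1, 0.4, 0.7, 0.3])
group_sizes = [3, 2, 2]

# Split the flat score array at the group boundaries
boundaries = np.cumsum(group_sizes)[:-1]
per_group_scores = np.split(scores, boundaries)

# Within each group, order row positions from highest to lowest score
rankings = [np.argsort(-s).tolist() for s in per_group_scores]

print(rankings)  # [[1, 2, 0], [1, 0], [0, 1]]
```

Each inner list gives positions relative to the start of its group, so in the first group the second row is ranked highest.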