How to Use XGBoost XGBRanker

The xgboost.XGBRanker class lets you train XGBoost models for learning-to-rank tasks through the scikit-learn estimator API.

However, scikit-learn has no native concept of query groups, so learning to rank is not fully supported there and the XGBRanker class can be tricky to use.

In this example, we’ll fit an XGBRanker model using the scikit-learn API.

import numpy as np
from xgboost import XGBRanker

# Generate synthetic data
np.random.seed(42)
n_samples = 100
n_features = 10

# Generate feature data
X = np.random.rand(n_samples, n_features)

# Generate 15 group sizes that are each at least 1 and sum to n_samples
n_groups = 15
group_sizes = np.random.multinomial(n_samples - n_groups, [1 / n_groups] * n_groups) + 1

# Ensure every group is non-empty and the sizes sum to n_samples
assert group_sizes.min() >= 1
assert group_sizes.sum() == n_samples

# Generate labels (targets) for ranking
y = np.random.rand(n_samples)

# Create an instance of the XGBRanker
model = XGBRanker(
    objective='rank:pairwise',
    learning_rate=0.1,
    max_depth=6,
    n_estimators=100
)

# Fit the model
model.fit(X, y, group=group_sizes.tolist())

# Make predictions
predictions = model.predict(X)

# Print the first 10 predictions
print(predictions[:10])

In this example, we first generate a synthetic learning-to-rank dataset with NumPy: random features, random relevance labels, and a set of query group sizes that sum to the number of rows.

Next, we define an XGBRanker model with the 'rank:pairwise' objective and other standard parameters.
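To give some intuition for what a pairwise objective optimizes, here is a minimal pure-Python sketch of a RankNet-style pairwise logistic loss computed within a single query group. It is an illustration only and not XGBoost's internal implementation; the function name pairwise_logistic_loss and the toy scores and labels are hypothetical.

```python
import math

def pairwise_logistic_loss(scores, labels):
    """Average logistic loss over all pairs (i, j) in one group where labels[i] > labels[j]."""
    total, n_pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                # The loss is small when the more relevant item i is scored well above j
                total += math.log1p(math.exp(-(scores[i] - scores[j])))
                n_pairs += 1
    return total / n_pairs if n_pairs else 0.0

# Scores that agree with the labels should give a lower loss than reversed scores
good = pairwise_logistic_loss([2.0, 1.0, 0.0], [2, 1, 0])
bad = pairwise_logistic_loss([0.0, 1.0, 2.0], [2, 1, 0])
print(good, bad)
```

Only pairs from the same group are compared, which is why the model needs the group information at training time.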

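We then fit the model, passing the size of each query group as a list to the group parameter of fit(). The rows of X and y must be ordered so that each group's rows are contiguous. Finally, predict() returns one score per row; the scores are only meaningful for ordering rows within the same group.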
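The list of group sizes maps onto the rows in order: the first size covers the first rows, the next size covers the rows after that, and so on. The sketch below, in pure Python, makes that mapping explicit by expanding the sizes into one query id per row and into row ranges. The helper names group_sizes_to_qid and group_slices are hypothetical. Recent XGBoost releases also appear to accept a per-row qid array in fit() in place of group; check the documentation for your installed version before relying on it.

```python
from itertools import accumulate

def group_sizes_to_qid(group_sizes):
    """Expand per-group sizes into one query id per row."""
    qid = []
    for query_id, size in enumerate(group_sizes):
        qid.extend([query_id] * size)
    return qid

def group_slices(group_sizes):
    """Return the (start, stop) row range covered by each group."""
    stops = list(accumulate(group_sizes))
    starts = [0] + stops[:-1]
    return list(zip(starts, stops))

sizes = [3, 2, 4]
print(group_sizes_to_qid(sizes))  # [0, 0, 0, 1, 1, 2, 2, 2, 2]
print(group_slices(sizes))        # [(0, 3), (3, 5), (5, 9)]
```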

This gives a minimal example of using the xgboost.XGBRanker class for learning to rank through the scikit-learn API.
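Predicted scores are usually turned into an ordering per query group rather than read as absolute values. The following is a small sketch of that step in pure Python, assuming the scores are in the same row order as the training data; the helper name rank_within_groups and the toy scores are hypothetical.

```python
def rank_within_groups(scores, group_sizes):
    """For each group, return its row indices ordered from highest to lowest score."""
    rankings = []
    start = 0
    for size in group_sizes:
        rows = range(start, start + size)
        rankings.append(sorted(rows, key=lambda r: scores[r], reverse=True))
        start += size
    return rankings

example_scores = [0.2, 0.9, 0.5, 0.1, 0.7]
example_groups = [3, 2]
print(rank_within_groups(example_scores, example_groups))  # [[1, 2, 0], [4, 3]]
```

With the model above, the same helper could be called as rank_within_groups(predictions, group_sizes).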
