XGBoost Configure "aft-nloglik" Eval Metric

The “aft-nloglik” is an evaluation metric used in XGBoost for survival analysis models. It is the negative log-likelihood of the Accelerated Failure Time (AFT) model, which assumes that the logarithm of the survival time equals the model's prediction plus a scaled noise term drawn from a chosen distribution (normal, logistic, or extreme value).
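To make the definition concrete, here is a minimal sketch of the AFT negative log-likelihood for uncensored data with a normal noise distribution. The function name aft_nloglik_uncensored and its arguments are illustrative, not part of XGBoost, and the sketch follows the textbook log-normal AFT likelihood rather than reproducing XGBoost's internal implementation (which also handles censoring and numerical clipping).

```python
import math
import numpy as np

def aft_nloglik_uncensored(y, mu, sigma=1.0):
    """Mean negative log-likelihood of uncensored times y under the
    log-normal AFT model ln(y) = mu + sigma * Z, with Z ~ N(0, 1).

    Hypothetical helper for illustration only.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    z = (np.log(y) - mu) / sigma
    # Log-density of the standard normal at z
    log_pdf_z = -0.5 * z**2 - 0.5 * math.log(2.0 * math.pi)
    # Change of variables from Z to Y: f_Y(y) = f_Z(z) / (sigma * y)
    log_pdf_y = log_pdf_z - math.log(sigma) - np.log(y)
    return float(-np.mean(log_pdf_y))

times = np.array([1.0, 2.0, 5.0])
good = aft_nloglik_uncensored(times, np.log(times))        # predictions match ln(y)
bad = aft_nloglik_uncensored(times, np.log(times) + 2.0)   # predictions off by 2 on the log scale
print(f"good fit: {good:.4f}, poor fit: {bad:.4f}")
```

Note the log(y) term in the density: multiplying all times by a constant c (and shifting the predictions by log c) changes the value by log c, which is why the metric depends on the units of the survival times.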

This metric should be used when the XGBoost model is configured with the “survival:aft” objective, which is designed for survival time prediction. Using “aft-nloglik” allows you to assess how well the model fits the observed survival data.

Here’s an example of how to use the “aft-nloglik” metric in XGBoost’s native Python API:

import numpy as np
from sklearn.model_selection import train_test_split
import xgboost as xgb

# Generate a synthetic survival dataset
np.random.seed(42)
n_samples = 1000
X = np.random.rand(n_samples, 5)
true_coef = np.array([1, -1, 2, -2, 1])
y = np.exp(-(X @ true_coef + np.random.normal(size=n_samples)))

# Create lower and upper bounds, here they are the same as y because there is no censoring
y_lower = y_upper = y

# Split the data into training and testing sets
X_train, X_test, y_train, y_test, y_lower_train, y_lower_test, y_upper_train, y_upper_test = train_test_split(X, y, y_lower, y_upper, test_size=0.2, random_state=42)

# Convert data into DMatrix, specifying the label, label_lower_bound, and label_upper_bound
dtrain = xgb.DMatrix(X_train, label=y_train, label_lower_bound=y_lower_train, label_upper_bound=y_upper_train)
dtest = xgb.DMatrix(X_test, label=y_test, label_lower_bound=y_lower_test, label_upper_bound=y_upper_test)


# Configure XGBoost parameters
params = {
    'objective': 'survival:aft',
    'eval_metric': 'aft-nloglik',
    'aft_loss_distribution': 'normal',
    'aft_loss_distribution_scale': 1.0,
    'tree_method': 'hist',
    'learning_rate': 0.05,
    'max_depth': 3,
}

# Train the model
model = xgb.train(params, dtrain, num_boost_round=100, evals=[(dtest, 'test')])

# Make predictions on test set
y_pred = model.predict(dtest)

# Print evaluation metric on test set
print(f"Test aft-nloglik: {model.eval(dtest, 'test').split(':')[-1]}")

In this example, we first generate a synthetic survival dataset where the survival time follows a log-normal distribution: its logarithm is a linear combination of the features plus standard normal noise. We then split the data into train and test sets, convert them to DMatrix format, and configure the XGBoost parameters.
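The synthetic data has no censoring, so both bounds equal the observed time. As a hedged sketch of how the bounds could be built when some observations are right-censored, the snippet below uses +inf as the upper bound for censored rows, following the interval convention described in XGBoost's AFT documentation. The helper make_aft_bounds and the time/event arrays are hypothetical names introduced for illustration.

```python
import numpy as np

def make_aft_bounds(time, event):
    """Build (lower, upper) label bounds for survival:aft.

    time  : observed time (event time or censoring time)
    event : 1 if the event was observed, 0 if right-censored
    Hypothetical helper for illustration only.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    lower = time.copy()
    # Right-censored rows: the true time is only known to exceed `time`
    upper = np.where(event, time, np.inf)
    return lower, upper

time = np.array([5.0, 8.0, 3.0])
event = np.array([1, 0, 1])  # the second subject is right-censored
lower, upper = make_aft_bounds(time, event)
print(lower)
print(upper)
```

The resulting arrays would be passed as label_lower_bound and label_upper_bound when constructing the DMatrix, in the same way as in the example above.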

Note how we specify 'survival:aft' as the objective and 'aft-nloglik' as the evaluation metric. We also set the 'aft_loss_distribution' to 'normal' and its scale parameter to 1.0.

During training, we pass evals=[(dtest, 'test')] to xgb.train() so that the “aft-nloglik” is calculated on the test set at each iteration. After training, we make predictions on the test set and print the final “aft-nloglik” value.

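The printed “aft-nloglik” is a negative log-likelihood, so lower values indicate a better fit. The value is usually positive, but it can be negative when the fitted density exceeds 1. The absolute value is less important than the relative values when comparing different models trained and evaluated on the same data.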

Some tips when using “aft-nloglik”:

While “aft-nloglik” is a useful metric for survival analysis, you may also want to consider other metrics like the concordance index (C-index) to assess the model’s ranking performance.
Be cautious when comparing “aft-nloglik” values across different datasets, as the metric depends on the scale (units) of the survival times, for example days versus years.
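For the ranking tip above, here is a minimal sketch of a Harrell-style concordance index written with NumPy only. The function concordance_index is a hypothetical helper, not an XGBoost API, and it assumes that larger predictions mean longer predicted survival times. It uses a simple O(n²) loop, so it is meant for small evaluation sets.

```python
import numpy as np

def concordance_index(time, pred, event=None):
    """Fraction of comparable pairs whose predicted order matches the
    observed order of survival times. Ties in pred count as 0.5.

    Hypothetical helper for illustration only.
    """
    time = np.asarray(time, dtype=float)
    pred = np.asarray(pred, dtype=float)
    if event is None:
        event = np.ones_like(time, dtype=bool)  # assume no censoring
    else:
        event = np.asarray(event, dtype=bool)

    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is only comparable if the earlier time is an observed event
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if pred[i] < pred[j]:
                    concordant += 1.0
                elif pred[i] == pred[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

observed = np.array([1.0, 2.0, 3.0, 4.0])
print(concordance_index(observed, np.array([10.0, 20.0, 30.0, 40.0])))  # perfectly ordered
print(concordance_index(observed, np.array([40.0, 30.0, 20.0, 10.0])))  # perfectly reversed
```

In the main example, this could be called with y_test and y_pred to complement the likelihood-based metric with a ranking-based one.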
