The “aft-nloglik” metric is an evaluation metric used in XGBoost for survival analysis models. It is the negative log-likelihood of the Accelerated Failure Time (AFT) model, which assumes that the logarithm of the survival time equals the model’s prediction plus a scaled noise term drawn from a chosen distribution (normal, logistic, or extreme value).
This metric should be used when the XGBoost model is configured with the “survival:aft” objective, which is designed for survival time prediction. Using “aft-nloglik” allows you to assess how well the model fits the observed survival data.
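For reference, the model and the per-sample loss behind the metric can be written as below. This is a sketch based on the usual presentation of XGBoost’s AFT objective; the symbols $T(\mathbf{x})$ (the model output), $\sigma$ (the value of `aft_loss_distribution_scale`), $Z$ (the noise variable with density $f_Z$ and CDF $F_Z$), and $z$ are notation introduced here for illustration.

```latex
% AFT model: log survival time = model output + scaled noise
\ln Y = T(\mathbf{x}) + \sigma Z

% Standardized residual for an observed time y
z(y) = \frac{\ln y - T(\mathbf{x})}{\sigma}

% Per-sample negative log-likelihood, uncensored observation y
\ell_{\text{uncensored}} = -\ln\!\left[\frac{f_Z\big(z(y)\big)}{\sigma\, y}\right]

% Per-sample negative log-likelihood, observation censored to [y_l, y_u]
\ell_{\text{censored}} = -\ln\!\left[F_Z\big(z(y_u)\big) - F_Z\big(z(y_l)\big)\right]
```

Under this reading, “aft-nloglik” is the average of these per-sample terms over the evaluation set.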
Here’s an example of how to use the “aft-nloglik” metric in XGBoost’s native Python API:
import numpy as np
from sklearn.model_selection import train_test_split
import xgboost as xgb
# Generate a synthetic survival dataset
np.random.seed(42)
n_samples = 1000
X = np.random.rand(n_samples, 5)
true_coef = np.array([1, -1, 2, -2, 1])
y = np.exp(-(X @ true_coef + np.random.normal(size=n_samples)))
# Create lower and upper bounds, here they are the same as y because there is no censoring
y_lower = y_upper = y
# Split the data into training and testing sets
(
    X_train, X_test,
    y_train, y_test,
    y_lower_train, y_lower_test,
    y_upper_train, y_upper_test,
) = train_test_split(X, y, y_lower, y_upper, test_size=0.2, random_state=42)
# Convert data into DMatrix, specifying the label, label_lower_bound, and label_upper_bound
dtrain = xgb.DMatrix(X_train, label=y_train, label_lower_bound=y_lower_train, label_upper_bound=y_upper_train)
dtest = xgb.DMatrix(X_test, label=y_test, label_lower_bound=y_lower_test, label_upper_bound=y_upper_test)
# Configure XGBoost parameters
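params = {
    'objective': 'survival:aft',
    'eval_metric': 'aft-nloglik',
    'aft_loss_distribution': 'normal',
    'aft_loss_distribution_scale': 1.0,
    'tree_method': 'hist',
    'learning_rate': 0.05,
    'max_depth': 3,
    # The number of boosting rounds is set via num_boost_round in xgb.train();
    # 'n_estimators' belongs to the scikit-learn wrapper and is not used here.
}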
# Train the model
model = xgb.train(params, dtrain, num_boost_round=100, evals=[(dtest, 'test')])
# Make predictions on test set
y_pred = model.predict(dtest)
# Print evaluation metric on test set
print(f"Test aft-nloglik: {model.eval(dtest, 'test').split(':')[-1]}")
In this example, we first generate a synthetic survival dataset where the survival time follows a log-normal distribution: its logarithm is a linear combination of the features plus normal noise, which matches the 'normal' AFT distribution used below. We then split the data into train and test sets, convert them to DMatrix format, and configure the XGBoost parameters.
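The dataset above has no censoring, so the lower and upper bounds are identical. As a minimal sketch of how the bounds might look with right-censored data, the snippet below uses hypothetical arrays `times` and `event_observed` (not part of the example above) and assumes the XGBoost convention that a right-censored label has an upper bound of positive infinity.

```python
import numpy as np

# Hypothetical observed times and event flags (1 = event observed, 0 = right-censored)
times = np.array([2.0, 5.0, 3.5, 8.0])
event_observed = np.array([1, 0, 1, 0])

# Uncensored rows: lower bound == upper bound == observed time
# Right-censored rows: lower bound == last observed time, upper bound == +inf
y_lower = times.copy()
y_upper = np.where(event_observed == 1, times, np.inf)

print(y_lower)
print(y_upper)
```

These arrays would then be passed as `label_lower_bound` and `label_upper_bound` when building the DMatrix, in the same way as in the example.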
Note how we specify 'survival:aft' as the objective and 'aft-nloglik' as the evaluation metric. We also set the 'aft_loss_distribution' to 'normal' and its scale parameter to 1.0.
During training, we pass evals=[(dtest, 'test')] to xgb.train() so that the “aft-nloglik” is calculated on the test set at each iteration. After training, we make predictions on the test set and print the final “aft-nloglik” value.
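Because “aft-nloglik” is a negative log-likelihood, lower values indicate a better fit. The value is usually positive, but it can be negative when the survival times are small, since it is based on a probability density rather than a probability. The absolute value is less important than the relative values when comparing different models on the same data.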
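To make the metric less of a black box, here is a minimal NumPy sketch of the calculation for uncensored data under the normal distribution. It is an illustration, not XGBoost’s implementation: the function name `aft_nloglik_normal_uncensored` is hypothetical, and it assumes that predictions are on the original time scale (so their logarithm is the model output) and that the likelihood is the density of the survival time itself.

```python
import numpy as np

def aft_nloglik_normal_uncensored(y_true, y_pred_time, sigma=1.0):
    """Mean negative log-likelihood for uncensored data under a normal AFT model.

    Hypothetical helper for illustration; assumes y_pred_time is on the
    original time scale and sigma plays the role of aft_loss_distribution_scale.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred_time = np.asarray(y_pred_time, dtype=float)
    # Standardized residual on the log scale
    z = (np.log(y_true) - np.log(y_pred_time)) / sigma
    # Log of the standard normal density at z
    log_pdf_z = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)
    # Change of variables from z to y contributes -log(sigma * y)
    log_lik = log_pdf_z - np.log(sigma) - np.log(y_true)
    return float(-np.mean(log_lik))

y_obs = np.array([1.0, 2.0, 4.0])
good = aft_nloglik_normal_uncensored(y_obs, y_obs)         # predictions match exactly
bad = aft_nloglik_normal_uncensored(y_obs, 10.0 * y_obs)   # predictions off by 10x
print(f"good fit: {good:.4f}, poor fit: {bad:.4f}")
```

The poorer predictions yield the larger value, which is the sense in which lower is better.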
Some tips when using “aft-nloglik”:
- While “aft-nloglik” is a useful metric for survival analysis, you may also want to consider other metrics like the concordance index (C-index) to assess the model’s ranking performance.
- Be cautious when interpreting “aft-nloglik” values across different datasets, as the metric is sensitive to the scale of the survival times.
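As a companion to the first tip, the following is a minimal sketch of a concordance index for uncensored data, such as the synthetic dataset above. The function name `concordance_index_uncensored` is hypothetical, and the sketch ignores censoring entirely; for censored data a censoring-aware implementation from a survival analysis library would be needed.

```python
import numpy as np

def concordance_index_uncensored(y_true, y_pred_time):
    """Fraction of comparable pairs whose predicted order matches the observed order.

    Hypothetical helper for illustration; assumes no censoring and that a
    larger prediction means a longer predicted survival time.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred_time = np.asarray(y_pred_time, dtype=float)
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with tied observed times are skipped in this sketch
            comparable += 1
            true_order = np.sign(y_true[i] - y_true[j])
            pred_order = np.sign(y_pred_time[i] - y_pred_time[j])
            if pred_order == 0:
                concordant += 0.5  # tied predictions count as half
            elif pred_order == true_order:
                concordant += 1.0
    return concordant / comparable if comparable else float("nan")

print(concordance_index_uncensored([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
```

A value of 1.0 means every pair is ranked correctly, 0.5 is what constant or random predictions give, and 0.0 means the ranking is fully reversed. Unlike “aft-nloglik”, this measure depends only on the ordering of the predictions, not on the scale of the survival times.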