XGBoost requires numerical input features.

Ordered categorical (ordinal) features must therefore be converted to numbers before training an XGBoost model. Ordinal encoding maps each unique category to an integer. For those integers to reflect the natural order of the categories, if one exists, the encoder has to be told that order.

scikit-learn’s `OrdinalEncoder` provides a simple and efficient way to perform this ordinal encoding of categorical features. When combined with `ColumnTransformer`, it allows for seamless integration of the encoding step into a machine learning pipeline, especially when dealing with datasets containing a mix of categorical and numerical features.
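One detail is worth illustrating before the full example: unless told otherwise, `OrdinalEncoder` assigns integers in sorted (alphabetical) order of the category labels, which is not necessarily their natural order. The sketch below compares the default behaviour with an explicit `categories` list. The `levels` array is a made-up toy column used only for illustration.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical ordinal column with the natural order low < medium < high
levels = np.array([['low'], ['medium'], ['high'], ['low']], dtype=object)

# Default: categories are sorted alphabetically (high, low, medium)
default_codes = OrdinalEncoder().fit_transform(levels).ravel()

# Explicit order: list the categories from lowest to highest rank
ordered_encoder = OrdinalEncoder(categories=[['low', 'medium', 'high']])
ordered_codes = ordered_encoder.fit_transform(levels).ravel()

print("Default codes:", default_codes)  # low=1, medium=2, high=0
print("Ordered codes:", ordered_codes)  # low=0, medium=1, high=2
```

Tree-based models can still split on arbitrarily assigned codes, but an encoding that follows the real order lets a single threshold separate "low" from "medium and above".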

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder
from xgboost import XGBRegressor

# Synthetic feature matrix X with one ordinal column and one numerical column
X = np.array([['low', 10.0],
              ['medium', 20.0],
              ['high', 30.0],
              ['low', 15.0],
              ['high', 25.0],
              ['medium', 18.0]], dtype=object)

# Example target variable
y = [1.5, 2.3, 3.1, 1.8, 2.9, 2.2]

# Apply OrdinalEncoder to the first column and pass the second through unchanged.
# The categories are listed explicitly so that low < medium < high; by default
# OrdinalEncoder would sort them alphabetically (high, low, medium).
transformer = ColumnTransformer(
    transformers=[
        ('ordinal', OrdinalEncoder(categories=[['low', 'medium', 'high']]), [0])],
    remainder='passthrough')

# Fit the encoder, encode the training data, and cast the object array to float
X = transformer.fit_transform(X).astype(float)

# Define the XGBoost model configuration
model = XGBRegressor(random_state=42)

# Fit the model on the encoded features
model.fit(X, y)

# New data for prediction
X_new = np.array([['medium', 22.0],
                  ['low', 12.0]], dtype=object)

# Encode the new data with the already fitted transformer
X_new = transformer.transform(X_new).astype(float)

# Make predictions with the trained model
predictions = model.predict(X_new)
print("Predictions:", predictions)
```

Here’s a step-by-step breakdown:

1. Import the necessary classes: `OrdinalEncoder` for encoding the categorical features, and `ColumnTransformer` for applying transformations to specific columns.
2. Create a `ColumnTransformer` named `transformer`. This transformer applies `OrdinalEncoder` to the first column (index 0) of the input data and passes the second column (index 1) through unchanged.
3. Apply the `transformer` to the dataset with `fit_transform`, which learns the category-to-integer mapping and converts the first column into integer codes.
4. Fit the model using the transformed input features `X` and target variable `y`.
5. To make predictions on new data `X_new`, pass it to the fitted `ColumnTransformer`’s `transform` method, which applies the learned encoding, and then pass the result to the trained model’s `predict` method.
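To check what the fitted transformer will do to new data, the learned mapping can be inspected through `named_transformers_`. The sketch below also shows one possible way to handle a category that never appeared during training, using `OrdinalEncoder`'s `handle_unknown='use_encoded_value'` option. The `'very high'` label and the choice of `-1` as the fallback code are assumptions made for illustration, not part of the example above.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

# Small toy training matrix: one ordinal column, one numerical column
X = np.array([['low', 10.0], ['medium', 20.0], ['high', 30.0]], dtype=object)

# Unknown categories at transform time are mapped to -1 (an assumed convention)
encoder = OrdinalEncoder(
    categories=[['low', 'medium', 'high']],
    handle_unknown='use_encoded_value',
    unknown_value=-1,
)
transformer = ColumnTransformer(
    transformers=[('ordinal', encoder, [0])],
    remainder='passthrough',
)
transformer.fit(X)

# The fitted encoder exposes the category order it uses for each column
fitted = transformer.named_transformers_['ordinal']
print("Learned categories:", fitted.categories_)

# 'very high' was not seen during fit, so it receives the fallback code
X_new = np.array([['medium', 22.0], ['very high', 40.0]], dtype=object)
encoded = transformer.transform(X_new).astype(float)
print(encoded)
```

Without `handle_unknown` set, the encoder raises an error when `transform` meets an unseen category. That may be the safer default if unseen values indicate a data problem.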

By incorporating `OrdinalEncoder` into a `ColumnTransformer`, you can streamline the process of preparing your data for XGBoost, especially when dealing with datasets containing both categorical and numerical features. This approach ensures that the necessary transformations are applied consistently and efficiently, both during model training and when making predictions on new data.