Column Transformer for Faster Feature Engineering in Machine Learning

Data pre-processing techniques

Amit Chauhan


Photo by Caspar Camille Rubin on Unsplash

Data is essential for predictive modeling, and it goes through various pre-processing steps before being fed into a machine learning model. Feature engineering is a vital part of data pre-processing: it handles missing values, scales data, encodes string categories as numbers, and applies other transformations.

Each column in the data can have a different problem, and each problem calls for its own pre-processing technique. The main difficulty in data pre-processing is handling every column appropriately.

Suppose we need one-hot encoding, imputation, ordinal encoding, and other techniques. Each of these processes produces a separate array per column, and all of these arrays then have to be concatenated into one big feature matrix. This approach is neither efficient nor fast.
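To see why, here is a minimal sketch of the manual approach on a toy DataFrame; the column names mirror the demo dataset used below, but the values are made up for illustration:

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# toy data: an ordered category, an unordered category,
# and a numeric column with a missing value
df = pd.DataFrame({
    'Pclass': [1, 3, 2, 3],
    'Sex': ['male', 'female', 'female', 'male'],
    'Score': [7.2, np.nan, 8.1, 6.5],
})

# every transformer is fitted separately and returns its own array...
pclass_arr = OrdinalEncoder().fit_transform(df[['Pclass']])
sex_arr = OneHotEncoder(drop='first').fit_transform(df[['Sex']]).toarray()
score_arr = SimpleImputer(strategy='mean').fit_transform(df[['Score']])

# ...and the arrays then have to be concatenated by hand
X = np.concatenate([pclass_arr, sex_arr, score_arr], axis=1)
X.shape  # (4, 3)

Three separate fit_transform calls plus a manual np.concatenate, and the bookkeeping only grows as the data gains more column types.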

The column transformer handles all of these troublesome columns in a single process: it applies a different transformer to each group of columns and stitches the results together for us.

Python Example:

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# load the demo dataset and take a look at it
df1 = pd.read_csv('demo.csv')
df1

# check each column for missing values
df1.isnull().sum()

#output:
PassengerId 0
Survived 0
Pclass 0
Sex 0
Score 16
dtype: int64
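
The Score column is the only one with missing values, so the first thing to fix is the 16 missing entries. A minimal sketch using the SimpleImputer imported above (mean imputation is an assumption here; any strategy would do):

# fill the 16 missing Score values with the column mean
imputer = SimpleImputer(strategy='mean')
df1['Score'] = imputer.fit_transform(df1[['Score']]).ravel()
df1.isnull().sum()  # Score now shows 0 missing values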

Let's say we have data whose columns need to be transformed before being fed to a machine learning model. The first phase will handle these columns with individual processes, and the second phase will use the column transformer technique, as sketched after the list below.

The columns can be grouped by the kind of problem each one has.

  1. Ordinal encoding: The categories in the Pclass column have a natural order, so the column needs ordinal encoding.
  2. One-hot encoding (OHE): The Sex column needs OHE because its categories have no order.
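
With the per-column problems identified, the second phase combines all of these transformers into a single ColumnTransformer. Here is a minimal sketch, assuming the df1 loaded above; the transformer names and the choice to drop PassengerId are illustrative, and including the Score imputer keeps the whole pipeline in one place even though Score was already imputed above:

from sklearn.compose import ColumnTransformer

# separate the target from the features first
X = df1.drop(columns=['Survived', 'PassengerId'])
y = df1['Survived']

# one object applies the right transformer to each column and
# concatenates the resulting arrays in a single step
ct = ColumnTransformer(transformers=[
    ('impute_score', SimpleImputer(strategy='mean'), ['Score']),
    ('ordinal_pclass', OrdinalEncoder(), ['Pclass']),
    ('ohe_sex', OneHotEncoder(drop='first'), ['Sex']),
])

X_transformed = ct.fit_transform(X)
X_transformed.shape  # one feature matrix, no manual concatenation

Each (name, transformer, columns) triple targets only its own columns, so the imputation, ordinal encoding, and one-hot encoding all happen in a single fit_transform call instead of three separate calls followed by a manual concatenation.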
