
Scaling of Data in Pandas for Data Analysis

Normalizing and standard scaling of data

Amit Chauhan
6 min read · Aug 17, 2022

Doesn't your work become easier when your data is presented in a uniform, well-organized way? Real-world data rarely arrives like that, so data normalization is one method for bringing it into a consistent, uniform form. In machine learning, data normalization is a common practice that transforms numeric columns onto a standard scale. We do this because features can differ greatly in their ranges of values, and many ML algorithms are sensitive to those differences. So it is usually worth normalizing the features before running these algorithms.

Let us look at different ways to normalize a pandas DataFrame:

Pandas Normalize using Mean Normalization

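Mean normalization is a way to implement feature scaling. The simplest way to normalize all columns of a pandas DataFrame is to subtract each column's mean from its values and divide the result by that column's standard deviation. (Strictly speaking, dividing by the standard deviation is usually called standardization or z-score scaling; mean normalization in the narrow sense divides by the range, max minus min.)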
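Written as a formula, the scaling described above looks roughly like this. The symbols are assumptions introduced here for illustration and do not come from the original text: x_{ij} is the value in row i of column j, \mu_j is the mean of column j, and \sigma_j is its standard deviation.

```latex
z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}
```

After this transformation each column has a mean of 0 and a standard deviation of 1, which is what puts columns with very different ranges on a comparable scale.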

Program:

import pandas as pd

# Two numeric columns on different scales
Companies = pd.DataFrame({"No": [1000, 2000, 3000], "Yes": [400, 500, 600]})
df = pd.DataFrame(Companies)
print(df)

# method 1

df_normalized =…
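As a minimal sketch of the subtract-the-mean, divide-by-the-standard-deviation step described earlier (not necessarily the author's method 1), the scaling could be written as below. The variable name `df_standardized` is hypothetical and chosen only for this illustration. Note that `DataFrame.std()` uses the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Same toy data as in the program above.
df = pd.DataFrame({"No": [1000, 2000, 3000], "Yes": [400, 500, 600]})

# Subtract each column's mean and divide by its standard deviation.
# df.mean() and df.std() return one value per column, and pandas
# aligns them with the matching columns of df.
# Assumption: sample standard deviation (the pandas default, ddof=1).
df_standardized = (df - df.mean()) / df.std()

print(df_standardized)
```

With this toy data, both columns end up as -1.0, 0.0, 1.0, even though the original values were on different scales.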
