Numerical data can be scaled to ensure proportionate influence on the prediction
Common techniques for scaling
So how exactly do we do it? How can we bring different features onto the same scale?
Keep in mind that not all ML algorithms are sensitive to the scale of their input features: decision trees, for example, are largely unaffected, while distance-based methods such as k-nearest neighbors are not. The following scaling and normalizing transformations are commonly used in data science and ML projects:
- Mean/variance standardization
- Min-max scaling
- Max-abs scaling
- Robust scaling
- Normalizer
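To make the list above concrete, here is a minimal sketch of each transformation in plain Python. The function names, the sample column `sqft`, and the choice of population standard deviation and inclusive quartiles are illustrative assumptions, not something the source specifies. Libraries such as scikit-learn provide equivalents (`StandardScaler`, `MinMaxScaler`, `MaxAbsScaler`, `RobustScaler`, `Normalizer`) that would normally be used in practice.

```python
import math
import statistics


def standardize(values):
    """Mean/variance standardization: zero mean, unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    return [(v - mean) / std for v in values]


def min_max_scale(values):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def max_abs_scale(values):
    """Max-abs scaling: divide by the largest absolute value."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values]


def robust_scale(values):
    """Robust scaling: subtract the median, divide by the interquartile range."""
    # assumption: quartiles computed with the "inclusive" method
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


def normalize_row(row):
    """Normalizer: rescale one observation (a row) to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]


# Hypothetical feature column containing one outlier.
sqft = [1.0, 2.0, 3.0, 4.0, 100.0]

print(standardize(sqft))
print(min_max_scale(sqft))
print(max_abs_scale(sqft))
print(robust_scale(sqft))
print(normalize_row([3.0, 4.0]))
```

Note that the first four functions operate on a column (one feature across all observations), while the normalizer operates on a row (one observation across its features). The sketch does not guard against a constant column or an all-zero row, either of which would cause a division by zero.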
In this example, you have a single column called home Type with three different levels: House, Apartment, and Condo. The data frame has five observations for that feature.
With one-hot encoding, you convert this single home Type column into three columns: one for House, one for Apartment, and one for Condo. Each observation is encoded with a 1 in the column that matches its home type and a 0 in the other two columns.
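The encoding described above can be sketched in a few lines of plain Python. The five observations in `home_type` and the helper name `one_hot_encode` are hypothetical, since the source does not list the actual values. In practice a library routine such as `pandas.get_dummies` would typically do this work.

```python
def one_hot_encode(values):
    """Return the distinct levels and one 0/1 row per observation."""
    # Distinct levels, kept in order of first appearance.
    levels = list(dict.fromkeys(values))
    # For each observation, 1 in the matching level's column, 0 elsewhere.
    rows = [{level: int(value == level) for level in levels} for value in values]
    return levels, rows


# Hypothetical data: five observations of the home type feature.
home_type = ["House", "Apartment", "Condo", "House", "Apartment"]

levels, encoded = one_hot_encode(home_type)
print(levels)
for value, row in zip(home_type, encoded):
    print(value, [row[level] for level in levels])
```

Each encoded row contains exactly one 1, so the three new columns together carry the same information as the original single column.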
============================================================
Topics related to this subdomain
Here are some topics you may want to study for more in-depth information related to this subdomain:
- Scaling
- Normalizing
- Dimensionality reduction
- Date formatting
- One-hot encoding