Categorical Variables and Cardinality

mgalalen

Mohammed Galalen

Posted on August 16, 2019

One approach to dealing with categorical variables is one-hot encoding.
But there is an important thing to keep in mind when using it: cardinality.
Cardinality is the number of unique values in a column.
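
To make the definition concrete, here is a minimal sketch that counts the unique values in each column. The toy dataset and the `cardinality` helper are made up for illustration and use only plain Python; with pandas you would typically get the same number from `DataFrame.nunique()`.

```python
# Hypothetical toy dataset: each key is a column name,
# each list holds that column's values.
data = {
    "color": ["red", "green", "blue", "green", "red"],
    "city": ["Cairo", "Khartoum", "Paris", "Tokyo", "Cairo"],
}


def cardinality(values):
    """Return the number of unique values in a column."""
    return len(set(values))


cardinalities = {column: cardinality(values) for column, values in data.items()}
print(cardinalities)  # {'color': 3, 'city': 4}
```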

It's generally better to use one-hot encoding only for columns with low cardinality. One-hot encoding adds one new column for every unique value, so on a large dataset a high-cardinality column can greatly expand the size of the dataset.
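
The growth is easy to see in a small sketch. The `one_hot_encode` helper and the `is_` column prefix below are assumptions made for illustration, written in plain Python; in practice you would more likely reach for something like `pandas.get_dummies`.

```python
def one_hot_encode(values):
    """One-hot encode a column: build one 0/1 column per unique value."""
    categories = sorted(set(values))
    return {
        f"is_{category}": [1 if value == category else 0 for value in values]
        for category in categories
    }


# Hypothetical column with a cardinality of 3.
color = ["red", "green", "blue", "green", "red"]

encoded = one_hot_encode(color)
for name, column in encoded.items():
    print(name, column)

# One original column became len(encoded) columns.
print(len(encoded))  # 3
```

A column with 3 unique values turns into 3 columns. A column with thousands of unique values would turn into thousands.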

High-cardinality columns can either be dropped from the dataset or encoded with label encoding instead.
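
One way to put this into practice is to pick a cardinality threshold and treat the columns on each side of it differently. The sketch below is only an illustration: the threshold of 3, the toy data, and the `label_encode` helper are all assumptions, not fixed rules.

```python
# Assumed cutoff: columns with more unique values than this are not one-hot encoded.
MAX_ONE_HOT_CARDINALITY = 3

# Hypothetical toy dataset.
data = {
    "color": ["red", "green", "blue", "green", "red"],
    "city": ["Cairo", "Khartoum", "Paris", "Tokyo", "Cairo"],
}


def label_encode(values):
    """Label encode a column: map each unique value to an integer code."""
    categories = sorted(set(values))
    codes = {category: code for code, category in enumerate(categories)}
    return [codes[value] for value in values]


# Split the columns by cardinality.
low_cardinality = [
    column for column, values in data.items()
    if len(set(values)) <= MAX_ONE_HOT_CARDINALITY
]
high_cardinality = [column for column in data if column not in low_cardinality]

# Option 1: drop the high-cardinality columns.
dropped = {column: data[column] for column in low_cardinality}

# Option 2: keep them, but label encode instead of one-hot encoding.
label_encoded = {column: label_encode(data[column]) for column in high_cardinality}

print(low_cardinality)   # ['color']
print(high_cardinality)  # ['city']
print(label_encoded)     # {'city': [0, 1, 2, 3, 0]}
```

Label encoding keeps the column count the same, but the integer codes imply an order that the original categories may not have.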
