Categorical Variables and Cardinality
Mohammed Galalen
Posted on August 16, 2019
One approach to dealing with categorical variables is one-hot encoding.
But there is an important thing to keep in mind when using it: cardinality.
Cardinality is the number of unique values in a column.
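To make the definition concrete, here is a minimal sketch that counts the unique values in each column using only the Python standard library. The `data` dictionary and the `cardinality` helper are hypothetical names made up for illustration; in pandas, `Series.nunique()` gives the same count.

```python
# Hypothetical toy dataset: each key is a column name,
# each list holds that column's values.
data = {
    "color": ["red", "green", "red", "blue"],
    "city": ["Cairo", "Khartoum", "Lagos", "Nairobi"],
}


def cardinality(values):
    """Return the number of unique values in a column."""
    return len(set(values))


# Map every column name to its cardinality.
cardinalities = {name: cardinality(col) for name, col in data.items()}
print(cardinalities)  # {'color': 3, 'city': 4}
```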
It's generally better to one-hot encode only columns with low cardinality, because one-hot encoding adds one new column for every unique value and can greatly expand the size of a large dataset.
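The growth in size is easy to see in a small sketch. The `one_hot_encode` helper below is a hypothetical, standard-library illustration of the idea, not a library API (pandas offers `get_dummies` for real use): it produces one 0/1 column per unique value, so the number of new columns equals the cardinality.

```python
def one_hot_encode(values):
    """Return one 0/1 column per unique value in `values`."""
    categories = sorted(set(values))
    return {
        f"is_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


# Hypothetical column with cardinality 3.
colors = ["red", "green", "red", "blue"]
encoded = one_hot_encode(colors)

for name, column in encoded.items():
    print(name, column)
# is_blue [0, 0, 0, 1]
# is_green [0, 1, 0, 0]
# is_red [1, 0, 1, 0]
```

A column with 3 unique values becomes 3 columns; a column with 10,000 unique values would become 10,000.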
High-cardinality columns can either be dropped from the dataset or label encoded instead.
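As a rough sketch of that decision, the snippet below splits columns by a cardinality threshold and label encodes a high-cardinality one. The threshold `MAX_CARDINALITY`, the toy `data`, and the `label_encode` helper are all assumptions chosen for illustration; the right cutoff depends on the dataset.

```python
MAX_CARDINALITY = 3  # hypothetical cutoff, pick one that suits your data

# Hypothetical toy dataset.
data = {
    "color": ["red", "green", "red", "blue"],
    "city": ["Cairo", "Khartoum", "Lagos", "Nairobi"],
}

# Columns at or under the cutoff are candidates for one-hot encoding;
# the rest are either dropped or label encoded.
low_cardinality = [c for c, v in data.items() if len(set(v)) <= MAX_CARDINALITY]
high_cardinality = [c for c, v in data.items() if len(set(v)) > MAX_CARDINALITY]


def label_encode(values):
    """Replace each unique value with an integer code (one column in, one column out)."""
    mapping = {category: i for i, category in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


codes, mapping = label_encode(data["city"])
print(low_cardinality, high_cardinality)  # ['color'] ['city']
print(codes)  # [0, 1, 2, 3]
```

Label encoding keeps the dataset the same width, but the integer codes imply an order that the categories may not actually have.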