Database Normalization simplified

muhammad_salem

Muhammad Salem

Posted on July 9, 2024

Database Normalization simplified

Applying the normalization rules
You can apply the data normalization rules (sometimes just called normalization rules) as the next step in your design. You use these rules to see if your tables are structured correctly. The process of applying the rules to your database design is called normalizing the database, or just normalization.

Normalization is most useful after you have represented all of the information items and have arrived at a preliminary design. The idea is to help you ensure that you have divided your information items into the appropriate tables. What normalization cannot do is ensure that you have all the correct data items to begin with.

You apply the rules in succession, at each step ensuring that your design arrives at one of what is known as the "normal forms." Five normal forms are widely accepted — the first normal form through the fifth normal form. This article expands on the first three, because they are all that is required for the majority of database designs.

Sure! Let's walk through the normalization process step by step, up to the third normal form (3NF), using clear examples to illustrate each step.

First Normal Form (1NF)

Definition: A table is in the first normal form if:

  1. All the columns contain atomic (indivisible) values.
  2. Each column contains values of a single type.
  3. Each column contains a unique name.
  4. The order in which data is stored does not matter.

Example:

Consider a table for storing customer orders:

OrderID CustomerName CustomerPhone Product1 Product2 Product3 ...
1 John Doe 123-456-7890 Widget A Widget B NULL ...
2 Jane Smith 987-654-3210 Widget C NULL NULL ...

This table is not in 1NF because it has repeating groups (Product1, Product2, Product3, ...).

To convert this to 1NF, we should remove the repeating groups and ensure each column contains only atomic values:

OrderID CustomerName CustomerPhone Product
1 John Doe 123-456-7890 Widget A
1 John Doe 123-456-7890 Widget B
2 Jane Smith 987-654-3210 Widget C

Now, each column contains atomic values and there are no repeating groups.

Second Normal Form (2NF)

Definition: A table is in the second normal form if:

  1. It is in 1NF.
  2. All non-key attributes are fully functional dependent on the primary key (i.e., there are no partial dependencies).

Example:

Consider the 1NF table from above:

OrderID CustomerName CustomerPhone Product
1 John Doe 123-456-7890 Widget A
1 John Doe 123-456-7890 Widget B
2 Jane Smith 987-654-3210 Widget C

Here, CustomerName and CustomerPhone are partially dependent on OrderID since they depend on the customer, not the specific product in the order.

To achieve 2NF, we separate the table into two tables: one for orders and another for customers:

Orders Table:

OrderID CustomerID Product
1 1 Widget A
1 1 Widget B
2 2 Widget C

Customers Table:

CustomerID CustomerName CustomerPhone
1 John Doe 123-456-7890
2 Jane Smith 987-654-3210

Now, all non-key attributes are fully dependent on the primary key of each table.

Third Normal Form (3NF)

Definition: A table is in the third normal form if:

  1. It is in 2NF.
  2. There are no transitive dependencies (i.e., non-key attributes do not depend on other non-key attributes).

Example:

Consider our 2NF tables:

Orders Table:

OrderID CustomerID Product
1 1 Widget A
1 1 Widget B
2 2 Widget C

Customers Table:

CustomerID CustomerName CustomerPhone
1 John Doe 123-456-7890
2 Jane Smith 987-654-3210

Assume we have another table for products with product details:

Products Table:

ProductID ProductName Price
A Widget A 10.00
B Widget B 15.00
C Widget C 20.00

To achieve 3NF, ensure there are no transitive dependencies. In our case, there are none since ProductName and Price are attributes of the Products table and are not dependent on any other non-key attribute.

The final structure is:

Orders Table:

OrderID CustomerID ProductID
1 1 A
1 1 B
2 2 C

Customers Table:

CustomerID CustomerName CustomerPhone
1 John Doe 123-456-7890
2 Jane Smith 987-654-3210

Products Table:

ProductID ProductName Price
A Widget A 10.00
B Widget B 15.00
C Widget C 20.00

Summary

  1. 1NF: Ensure each table column contains atomic values and there are no repeating groups.
  2. 2NF: Ensure the table is in 1NF and all non-key attributes are fully dependent on the primary key.
  3. 3NF: Ensure the table is in 2NF and there are no transitive dependencies.

By following these steps, you can normalize your database to ensure it is well-structured and efficient, reducing redundancy and ensuring data integrity.

💖 💪 🙅 🚩
muhammad_salem
Muhammad Salem

Posted on July 9, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

What was your win this week?
weeklyretro What was your win this week?

November 29, 2024

Where GitOps Meets ClickOps
devops Where GitOps Meets ClickOps

November 29, 2024

How to Use KitOps with MLflow
beginners How to Use KitOps with MLflow

November 29, 2024

Modern C++ for LeetCode 🧑‍💻🚀
leetcode Modern C++ for LeetCode 🧑‍💻🚀

November 29, 2024