Database Normalization simplified
Muhammad Salem
Posted on July 9, 2024
Applying the normalization rules
You can apply the data normalization rules (sometimes just called normalization rules) as the next step in your design. You use these rules to see if your tables are structured correctly. The process of applying the rules to your database design is called normalizing the database, or just normalization.
Normalization is most useful after you have represented all of the information items and have arrived at a preliminary design. The idea is to help you ensure that you have divided your information items into the appropriate tables. What normalization cannot do is ensure that you have all the correct data items to begin with.
You apply the rules in succession, at each step ensuring that your design arrives at one of what is known as the "normal forms." Five normal forms are widely accepted — the first normal form through the fifth normal form. This article expands on the first three, because they are all that is required for the majority of database designs.
Sure! Let's walk through the normalization process step by step, up to the third normal form (3NF), using clear examples to illustrate each step.
First Normal Form (1NF)
Definition: A table is in the first normal form if:
- All the columns contain atomic (indivisible) values.
- Each column contains values of a single type.
- Each column contains a unique name.
- The order in which data is stored does not matter.
Example:
Consider a table for storing customer orders:
OrderID | CustomerName | CustomerPhone | Product1 | Product2 | Product3 | ... |
---|---|---|---|---|---|---|
1 | John Doe | 123-456-7890 | Widget A | Widget B | NULL | ... |
2 | Jane Smith | 987-654-3210 | Widget C | NULL | NULL | ... |
This table is not in 1NF because it has repeating groups (Product1, Product2, Product3, ...).
To convert this to 1NF, we should remove the repeating groups and ensure each column contains only atomic values:
OrderID | CustomerName | CustomerPhone | Product |
---|---|---|---|
1 | John Doe | 123-456-7890 | Widget A |
1 | John Doe | 123-456-7890 | Widget B |
2 | Jane Smith | 987-654-3210 | Widget C |
Now, each column contains atomic values and there are no repeating groups.
Second Normal Form (2NF)
Definition: A table is in the second normal form if:
- It is in 1NF.
- All non-key attributes are fully functional dependent on the primary key (i.e., there are no partial dependencies).
Example:
Consider the 1NF table from above:
OrderID | CustomerName | CustomerPhone | Product |
---|---|---|---|
1 | John Doe | 123-456-7890 | Widget A |
1 | John Doe | 123-456-7890 | Widget B |
2 | Jane Smith | 987-654-3210 | Widget C |
Here, CustomerName
and CustomerPhone
are partially dependent on OrderID
since they depend on the customer, not the specific product in the order.
To achieve 2NF, we separate the table into two tables: one for orders and another for customers:
Orders Table:
OrderID | CustomerID | Product |
---|---|---|
1 | 1 | Widget A |
1 | 1 | Widget B |
2 | 2 | Widget C |
Customers Table:
CustomerID | CustomerName | CustomerPhone |
---|---|---|
1 | John Doe | 123-456-7890 |
2 | Jane Smith | 987-654-3210 |
Now, all non-key attributes are fully dependent on the primary key of each table.
Third Normal Form (3NF)
Definition: A table is in the third normal form if:
- It is in 2NF.
- There are no transitive dependencies (i.e., non-key attributes do not depend on other non-key attributes).
Example:
Consider our 2NF tables:
Orders Table:
OrderID | CustomerID | Product |
---|---|---|
1 | 1 | Widget A |
1 | 1 | Widget B |
2 | 2 | Widget C |
Customers Table:
CustomerID | CustomerName | CustomerPhone |
---|---|---|
1 | John Doe | 123-456-7890 |
2 | Jane Smith | 987-654-3210 |
Assume we have another table for products with product details:
Products Table:
ProductID | ProductName | Price |
---|---|---|
A | Widget A | 10.00 |
B | Widget B | 15.00 |
C | Widget C | 20.00 |
To achieve 3NF, ensure there are no transitive dependencies. In our case, there are none since ProductName
and Price
are attributes of the Products table and are not dependent on any other non-key attribute.
The final structure is:
Orders Table:
OrderID | CustomerID | ProductID |
---|---|---|
1 | 1 | A |
1 | 1 | B |
2 | 2 | C |
Customers Table:
CustomerID | CustomerName | CustomerPhone |
---|---|---|
1 | John Doe | 123-456-7890 |
2 | Jane Smith | 987-654-3210 |
Products Table:
ProductID | ProductName | Price |
---|---|---|
A | Widget A | 10.00 |
B | Widget B | 15.00 |
C | Widget C | 20.00 |
Summary
- 1NF: Ensure each table column contains atomic values and there are no repeating groups.
- 2NF: Ensure the table is in 1NF and all non-key attributes are fully dependent on the primary key.
- 3NF: Ensure the table is in 2NF and there are no transitive dependencies.
By following these steps, you can normalize your database to ensure it is well-structured and efficient, reducing redundancy and ensuring data integrity.
Posted on July 9, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.