Normalization and Normal Forms (1NF, 2NF, 3NF)

kellyblaire

Kelly Okere

Posted on June 7, 2024

Normalization and Normal Forms (1NF, 2NF, 3NF)

Introduction

Normalization is a systematic approach to organizing data in a database to reduce redundancy and improve data integrity. The process involves decomposing a table into smaller, related tables without losing data. This article will explain the concepts of normalization and the different normal forms (1NF, 2NF, 3NF), providing clear illustrations and examples to help students understand these concepts thoroughly.

What is Normalization?

Normalization involves structuring a relational database in a way that minimizes redundancy and dependency by organizing fields and table relations. The primary goals of normalization are to:

  • Eliminate redundant data.
  • Ensure data dependencies make sense.
  • Reduce the potential for anomalies during data operations (insertion, update, deletion).

Normal Forms

Normal forms are a series of guidelines that a relational database must follow to be considered normalized. Each normal form builds on the previous one, creating a series of increasingly stringent rules.

First Normal Form (1NF)

A table is in the First Normal Form if:

  1. All the values in a table are atomic (indivisible).
  2. Each column contains values of a single type.
  3. Each column contains unique values.
  4. The order in which data is stored does not matter.

Example of 1NF

Consider a table that stores information about students and their courses:

StudentID StudentName Courses
1 John Doe Math, Science
2 Jane Smith History, Math

This table is not in 1NF because the Courses column contains multiple values. To convert it to 1NF, we need to ensure that each column contains atomic values:

StudentID StudentName Course
1 John Doe Math
1 John Doe Science
2 Jane Smith History
2 Jane Smith Math

Second Normal Form (2NF)

A table is in the Second Normal Form if:

  1. It is in 1NF.
  2. All non-key attributes are fully functionally dependent on the primary key.

This means that there should be no partial dependency of any column on the primary key. In other words, all columns must depend on the entire primary key.

Example of 2NF

Consider the following table that stores information about students, courses, and instructors:

StudentID CourseID StudentName CourseName InstructorName
1 101 John Doe Math Dr. Smith
1 102 John Doe Science Dr. Jones
2 101 Jane Smith Math Dr. Smith
2 103 Jane Smith History Dr. Brown

This table is in 1NF but not in 2NF because StudentName depends only on StudentID and CourseName, InstructorName depend only on CourseID, not on the combination of StudentID and CourseID. To convert it to 2NF, we decompose the table into two tables:

Students Table:
| StudentID | StudentName |
|-----------|-------------|
| 1 | John Doe |
| 2 | Jane Smith |

Courses Table:
| CourseID | CourseName | InstructorName |
|----------|------------|----------------|
| 101 | Math | Dr. Smith |
| 102 | Science | Dr. Jones |
| 103 | History | Dr. Brown |

Enrollment Table:
| StudentID | CourseID |
|-----------|----------|
| 1 | 101 |
| 1 | 102 |
| 2 | 101 |
| 2 | 103 |

Third Normal Form (3NF)

A table is in the Third Normal Form if:

  1. It is in 2NF.
  2. There are no transitive dependencies.

A transitive dependency occurs when a non-key column is dependent on another non-key column.

Example of 3NF

Consider the following table:

StudentID CourseID CourseName InstructorName InstructorOffice
1 101 Math Dr. Smith Room 101
1 102 Science Dr. Jones Room 102
2 101 Math Dr. Smith Room 101
2 103 History Dr. Brown Room 103

This table is in 2NF but not in 3NF because InstructorOffice is dependent on InstructorName, which is not a key. To convert it to 3NF, we decompose it further:

Students Table:
| StudentID | StudentName |
|-----------|-------------|
| 1 | John Doe |
| 2 | Jane Smith |

Courses Table:
| CourseID | CourseName | InstructorName |
|----------|------------|----------------|
| 101 | Math | Dr. Smith |
| 102 | Science | Dr. Jones |
| 103 | History | Dr. Brown |

Instructors Table:
| InstructorName | InstructorOffice |
|----------------|------------------|
| Dr. Smith | Room 101 |
| Dr. Jones | Room 102 |
| Dr. Brown | Room 103 |

Enrollment Table:
| StudentID | CourseID |
|-----------|----------|
| 1 | 101 |
| 1 | 102 |
| 2 | 101 |
| 2 | 103 |

Summary

Normalization is an essential process in database design that aims to reduce redundancy and ensure data integrity. By following the rules of normalization and moving through the different normal forms (1NF, 2NF, 3NF), we can create a well-structured database that minimizes data anomalies and supports efficient data operations. Understanding and applying these principles is fundamental for anyone involved in database design and management.

💖 💪 🙅 🚩
kellyblaire
Kelly Okere

Posted on June 7, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

What was your win this week?
weeklyretro What was your win this week?

November 29, 2024

Where GitOps Meets ClickOps
devops Where GitOps Meets ClickOps

November 29, 2024

How to Use KitOps with MLflow
beginners How to Use KitOps with MLflow

November 29, 2024