Normalization of Tables: From 4NF to 5NF

harshmange

Harsh Mange

Posted on April 21, 2023

Normalization of Tables: From 4NF to 5NF

Fourth Normal Form (4NF) is a level of database normalization that is designed to handle situations where a table has multiple independent, multi-valued dependencies. However, there are situations where even 4NF may not be enough to ensure that a database is free from all anomalies. This is where the fifth normal form (5NF) comes in.

In this answer, we will explain the difference between 4NF and 5NF and provide an example to illustrate the normalization process.

4NF requires that a table has no multi-valued dependencies. However, in some cases, a table may have overlapping multi-valued dependencies, which means that the same set of attributes can determine different sets of values. This can lead to data anomalies and redundancies.

Fifth normal form (5NF) is a level of database normalization that addresses this issue. 5NF requires that a table has no join dependencies, which means that no set of attributes can be determined by two or more independent sets of attributes.

Here is an example that illustrates the difference between 4NF and 5NF:

Consider a table that stores information about customers, their orders, and the products that were ordered:

Customer_Order_Product
-----------------------
CustomerID
OrderID
ProductID
CustomerName
OrderDate
ProductName
ProductDescription

Enter fullscreen mode Exit fullscreen mode

In this table, the primary key is a combination of CustomerID, OrderID, and ProductID. The non-key attributes CustomerName, OrderDate, ProductName, and ProductDescription are dependent on their respective determinants. This table is in 4NF because it has no multi-valued dependencies.

However, this table has overlapping dependencies between the CustomerID and OrderID attributes. This means that the same set of CustomerID and OrderID values can determine different sets of ProductName and ProductDescription values, which can lead to data anomalies and redundancies.

To normalize this table to 5NF, we need to split it into three tables:

Customer
--------
CustomerID (primary key)
CustomerName

Order
-----
CustomerID (foreign key)
OrderID (primary key)
OrderDate

Product
-------
ProductID (primary key)
ProductName
ProductDescription

Order_Product
-------------
OrderID (foreign key)
ProductID (foreign key)

Enter fullscreen mode Exit fullscreen mode

In this normalized schema, the Customer, Order, and Product tables contain only information about the customers, orders, and products, respectively. The Order_Product table contains only the relationship between orders and products, with foreign keys to the Order and Product tables.

This ensures that there are no join dependencies between the attributes in the schema.

To summarize, 4NF ensures that a table has no multi-valued dependencies, while 5NF ensures that there are no join dependencies between the attributes in the schema. If a table has overlapping multi-valued dependencies, 5NF may be necessary to ensure that the schema is free from anomalies.

💖 💪 🙅 🚩
harshmange
Harsh Mange

Posted on April 21, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

What was your win this week?
weeklyretro What was your win this week?

November 29, 2024

Where GitOps Meets ClickOps
devops Where GitOps Meets ClickOps

November 29, 2024

How to Use KitOps with MLflow
beginners How to Use KitOps with MLflow

November 29, 2024