Normalization of Tables: From 4NF to 5NF
Harsh Mange
Posted on April 21, 2023
Fourth Normal Form (4NF) is a level of database normalization that is designed to handle situations where a table has multiple independent, multi-valued dependencies. However, there are situations where even 4NF may not be enough to ensure that a database is free from all anomalies. This is where the fifth normal form (5NF) comes in.
In this answer, we will explain the difference between 4NF and 5NF and provide an example to illustrate the normalization process.
4NF requires that a table has no multi-valued dependencies. However, in some cases, a table may have overlapping multi-valued dependencies, which means that the same set of attributes can determine different sets of values. This can lead to data anomalies and redundancies.
Fifth normal form (5NF) is a level of database normalization that addresses this issue. 5NF requires that a table has no join dependencies, which means that no set of attributes can be determined by two or more independent sets of attributes.
Here is an example that illustrates the difference between 4NF and 5NF:
Consider a table that stores information about customers, their orders, and the products that were ordered:
Customer_Order_Product
-----------------------
CustomerID
OrderID
ProductID
CustomerName
OrderDate
ProductName
ProductDescription
In this table, the primary key is a combination of CustomerID
, OrderID
, and ProductID
. The non-key attributes CustomerName
, OrderDate
, ProductName
, and ProductDescription
are dependent on their respective determinants. This table is in 4NF because it has no multi-valued dependencies.
However, this table has overlapping dependencies between the CustomerID
and OrderID
attributes. This means that the same set of CustomerID
and OrderID
values can determine different sets of ProductName
and ProductDescription
values, which can lead to data anomalies and redundancies.
To normalize this table to 5NF, we need to split it into three tables:
Customer
--------
CustomerID (primary key)
CustomerName
Order
-----
CustomerID (foreign key)
OrderID (primary key)
OrderDate
Product
-------
ProductID (primary key)
ProductName
ProductDescription
Order_Product
-------------
OrderID (foreign key)
ProductID (foreign key)
In this normalized schema, the Customer
, Order
, and Product
tables contain only information about the customers, orders, and products, respectively. The Order_Product
table contains only the relationship between orders and products, with foreign keys to the Order
and Product
tables.
This ensures that there are no join dependencies between the attributes in the schema.
To summarize, 4NF ensures that a table has no multi-valued dependencies, while 5NF ensures that there are no join dependencies between the attributes in the schema. If a table has overlapping multi-valued dependencies, 5NF may be necessary to ensure that the schema is free from anomalies.
Posted on April 21, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.