Normal data and ALTER to improve your database

mtrageser

MTrageser

Posted on September 28, 2020

Normal data and ALTER to improve your database

Hello and welcome back to Today in Tech.

This week's blog post will discuss what normalization of a database is and follow up by explaining how to implement and use normal data types to improve my COVID 19 database. As a recap from last week, we were tasked with creating a database from our entity relationship diagrams modeling one’s risk of contracting COVID 19. Using “CRUD” commands of SQL, I demonstrated how to manipulate data and construct the database. This week using normal data to my advantage, I will be showing how to improve the readability and flow of the database.

Normalization, in short, is the process of constructing a database to fit a certain mold of data. Data is classified as normal on a series of 6 levels. For this week’s post I will be sharing details on the first two levels and how to use normal data within my own database. Normal data is cumulative from one level to the next, meaning that whatever is needed to be considered first normal must also be included to be a second order classification.

The first type of normal data I will be discussing is first normal form data. The first of 6 levels, naturally first order normal data should be the most simplistic. In order to be considered first order normal, data within a table must be unique to a primary key, the data must also be reduced as far as it can be with no repeated groups.

Seen below is my Occupation table within my database. This data is considered to be first normal form because it meets all criteria specified above. The primary key in this case is OccupationID which then is inclusive of all other pieces of data in this table.

Alt Text

Using knowledge of how to construct data so that it is normal, cleaning up your database via modification becomes easier. Using the ALTER command, you can delete certain sections of your database that otherwise would be unnecessary. Data might need to be altered because it is too similar to a neighboring data set. Seen below, is my Patient data table. Two of the columns from last week's creation appear to be strikingly similar. FamilyHistory and ExistingConditions essentially mean the same thing. As I said before, I created this database from the ER diagramming I had been working on in the previous weeks. Until learning of how to simplify a database, I was unaware of how similar these datasets look. Using the ALTER command, I can edit the setup of this table to improve its simplicity.

Alt Text

The next level of complexity within a database would naturally be second normal form. To be classified as second normal form, data must first fulfill all qualifications of being first normal form as well as being exclusively reliant on the table’s primary key. An example of this is in my Hospitals table. All of the data within this table relates directly to the primary key of HospitalID. The remaining columns of data in the table are directly related to HospitalID and exclusively related to that primary key.

Alt Text

I’ve followed up this post with a screencast showing these definitions and tools in action. Thanks for reading and stay tuned for next week’s post!

💖 💪 🙅 🚩
mtrageser
MTrageser

Posted on September 28, 2020

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

What was your win this week?
weeklyretro What was your win this week?

November 29, 2024

How to Use KitOps with MLflow
beginners How to Use KitOps with MLflow

November 29, 2024

PostgreSQL Full Text Search Rank by Position
postgressql PostgreSQL Full Text Search Rank by Position

November 30, 2024