Normal data and ALTER to improve your database
MTrageser
Posted on September 28, 2020
Hello and welcome back to Today in Tech.
This week's blog post will discuss what normalization of a database is and follow up by explaining how to implement and use normal data types to improve my COVID 19 database. As a recap from last week, we were tasked with creating a database from our entity relationship diagrams modeling one’s risk of contracting COVID 19. Using “CRUD” commands of SQL, I demonstrated how to manipulate data and construct the database. This week using normal data to my advantage, I will be showing how to improve the readability and flow of the database.
Normalization, in short, is the process of constructing a database to fit a certain mold of data. Data is classified as normal on a series of 6 levels. For this week’s post I will be sharing details on the first two levels and how to use normal data within my own database. Normal data is cumulative from one level to the next, meaning that whatever is needed to be considered first normal must also be included to be a second order classification.
The first type of normal data I will be discussing is first normal form data. The first of 6 levels, naturally first order normal data should be the most simplistic. In order to be considered first order normal, data within a table must be unique to a primary key, the data must also be reduced as far as it can be with no repeated groups.
Seen below is my Occupation table within my database. This data is considered to be first normal form because it meets all criteria specified above. The primary key in this case is OccupationID which then is inclusive of all other pieces of data in this table.
Using knowledge of how to construct data so that it is normal, cleaning up your database via modification becomes easier. Using the ALTER command, you can delete certain sections of your database that otherwise would be unnecessary. Data might need to be altered because it is too similar to a neighboring data set. Seen below, is my Patient data table. Two of the columns from last week's creation appear to be strikingly similar. FamilyHistory and ExistingConditions essentially mean the same thing. As I said before, I created this database from the ER diagramming I had been working on in the previous weeks. Until learning of how to simplify a database, I was unaware of how similar these datasets look. Using the ALTER command, I can edit the setup of this table to improve its simplicity.
The next level of complexity within a database would naturally be second normal form. To be classified as second normal form, data must first fulfill all qualifications of being first normal form as well as being exclusively reliant on the table’s primary key. An example of this is in my Hospitals table. All of the data within this table relates directly to the primary key of HospitalID. The remaining columns of data in the table are directly related to HospitalID and exclusively related to that primary key.
I’ve followed up this post with a screencast showing these definitions and tools in action. Thanks for reading and stay tuned for next week’s post!
Posted on September 28, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.