Designing a Database to Track My Sneakers

jcsh

Justin Ho

Posted on September 18, 2020

Designing a Database to Track My Sneakers
Cover Photo - Design data by Undraw

This is an undergraduate computer science project I had a chance to work on with a classmate of mine: Garry (@garrygrewal)

Preface

This article aims to be a mental walk through of applying ER modeling to SQL databases via a project example. Although I try to include links to every technical term, this is not a comprehensive beginner's guide to ER modeling. However, I did plan on posting a guide to ER modeling so please let me know via comment or DM if my explanations helped you understand ER Models better.

What is HypeTracker?

What is this "hype" and how do I get more of it?

HypeTracker is a data aggregator application which took data from social media platforms such as Reddit and Twitter to display the number of occurrences a sneaker has been mentioned by people over a period of time. This data is valuable because the prices of aftermarket sneakers can be affected by the attention or "hype" of said sneaker at a certain point in time. By collecting this data and displaying it in graphical form, users can make informed decisions about whether to purchase a certain sneaker based on the perceived value through social media attention and comparing it against the price history of the item.

Designing a Database Using ER Modeling

This project was largely data-centric and so we wanted to design the methods in which we are storing and accessing data in order to not run into issues such as incorrect data relationships or duplicate entries. The errors described could have forced us to delete our database and restart from scratch which can be disheartening depending on the progress of the project.

Entities & Attributes

Can I re-roll for more strength?

In HypeTracker, we used entity relationship (ER) modeling as a way to visually describe our data model before implementing it in SQL. We started by listing out our most important entities (real world objects):

  • Sneakers - the topic focus of this application
  • Members - the users of our application
  • Rankings - stores the historical data of mentions / occurrences of a sneaker

Next, we wrote down some attributes (characteristics or information) we had in mind for each entity.

Sneakers Members Rankings
name name platform
brand email mentions
price* password date

*** retail price at launch (not price history)

Keep in mind that this is only the initial set of items we thought of and more attributes were added later on. However, this chart gave us a baseline for the most important attributes needed for our application and a simplistic view to refer back to once our data model becomes complicated.

Relationships Between Entities

What are we?

In the next phase, we began to define the relationships between entities through simple scenarios of how the entities interact.

Sneakers and Members

  • each sneaker may be monitored by one or more member
  • each member may monitor one or more sneaker

Sneakers and Rankings

  • each sneaker can have zero or more rankings
  • each ranking can only contain one sneaker

*** Members and Rankings entities have no relation between them

Entity relationships can be modeled by their cardinality, which adds a numerical representation to their relations. For example, sneakers and members have a many-to-many relationship, because one sneaker can be watched by many members while one member may watch many sneakers. Meanwhile, sneakers and rankings have a one-to-many relationship because one sneaker can have zero or more ranks associated with it, but each ranking can only describe one sneaker.

Translating all of that into symbols using Crow's foot notation for cardinality, this is what our ER diagram looks like at this point.

Initial conceptual diagram modelling relationships between entities listed earlier

Initial ER Diagram With Important Entities and Their Relationships

Weak Entity Sets

Apes strong together

A key point to identify at this point is that the Rankings entity does not exist without at least one Sneaker entity. This creates a different type of relationship where the weak entity (Rankings) has an existence dependency on the stronger entity (Sneakers). We can represent this by changing the relationship into a double diamond, changing the weak entity into a double rectangle, and using two lines between the weak entity and weak relationship.

Modified ER Diagram After Account For the Weak Entity (Rankings)

ER Diagram After Modifications for Weak Entity Sets

Many-To-Many Relationships

There's not enough room for all of us in this relationship

Unfortunately we were not finished with this data model yet. Another glaring issue was the cardinality between the Sneakers and Members entities. Many-to-many relationships creates problems in SQL such as how can one members row in the database store many sneakers at the same time? In addition, there are other issues and proposed solutions which you can read about in this article, but the recommended solution is to use an associative entity.

Using an associative entity, we can refactor the relationship between sneakers and members into a new Watchlist entity which keeps track of members and their sneakers.

ER Diagram after adding an associative entity between Sneakers and Members

ER Diagram After Modifications for Associative Entities

Finishing Up

That was easy

Now we were basically done! Just add in the foreign key(s) as attributes and underline the primary key(s) and this was the resulting ER diagram:

The Final ER Diagram

ER Diagram Complete!

Here is the SQL file to implement this in MariaDB (v10.1.35):

/* hypetracker.sql */

CREATE DATABASE IF NOT EXISTS HypeTracker;

USE HypeTracker;

CREATE TABLE IF NOT EXISTS `Sneakers` (
  `Name` VARCHAR(100) NOT NULL,
  `Price` DECIMAL(9,2) UNSIGNED,
  `Brand` VARCHAR(40) NOT NULL,
  CONSTRAINT `PK_Sneakers` PRIMARY KEY (Name)
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS `Members` (
  `Name` VARCHAR(40) NOT NULL,
  `Email` VARCHAR(100) NOT NULL,
  `Password` VARCHAR(40) NOT NULL,
  CONSTRAINT `PK_Members` PRIMARY KEY (Email)
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS `Watchlist` (
  `SneakerName` VARCHAR(100) NOT NULL,
  `MemberEmail` VARCHAR(100) NOT NULL,
  CONSTRAINT `PK_Watchlist` 
    PRIMARY KEY (SneakerName, MemberEmail),
  CONSTRAINT `FK_Watchlist_Sneakers`
    FOREIGN KEY (SneakerName) 
    REFERENCES Sneakers(Name)
    ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `FK_Watchlist_Members`
    FOREIGN KEY (MemberEmail) 
    REFERENCES Members(Email)
    ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS `Rankings` (
  `Platform` VARCHAR(100) NOT NULL,
  `Date` DATE NOT NULL,
  `SneakerName` VARCHAR(100) NOT NULL,
  `Mentions` INT UNSIGNED NOT NULL,
  CONSTRAINT `PK_Rankings`
    PRIMARY KEY (Platform, Date, SneakerName),
  CONSTRAINT `FK_Rankings_Sneakers`
    FOREIGN KEY (SneakerName)
    REFERENCES Sneakers(Name)
    ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB;

Enter fullscreen mode Exit fullscreen mode

Here are some example SQL queries to retrieve data for certain scenarios:


/* Get All Sneakers Watched by a Member */
SELECT S.Name, S.Price, S.Brand
FROM Sneakers S
INNER JOIN Watchlist W
ON S.Name = W.SneakerName
WHERE W.MemberEmail = '$email'; /* $email is a PHP variable here */

/* Get the 5 Most Mentioned Sneakers In the Last Week */
SELECT S.Name, S.Price, S.Brand, RS.TotalMentions
FROM Sneakers S
INNER JOIN
(
  SELECT R.SneakerName, SUM(Mentions) AS TotalMentions
  FROM Rankings R
  WHERE R.Date > DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
  GROUP BY R.SneakerName
  ORDER BY TotalMentions DESC
  LIMIT 5
) RS
ON S.Name = RS.SneakerName;

Enter fullscreen mode Exit fullscreen mode

Takeaways and Learnings

It's not over yet!

To recap, we designed a relational database based on our requirements using the entity relationship model in order to visualize our database before implementation. I learned how to express data relationships through cardinality and how to refactor many-to-many relationships so that it will work nicely in SQL. Although creating ER diagrams can be tedious, this is an important process to verify our design decisions in order to avoid simple dependency or redundancy issues later on. I am continuously learning more about SQL and this write-up details an iteration of the project after it had been implemented in PHP.

Data Normalization

That being said, we were working with a relatively simple model due to the small number of entities we needed and did not run in more issues which will require more normalization techniques. I've avoided using this term in the post because it is a complicated topic on its own and we were able to achieve a data model in Boyce-Codd Normal Form (BCNF) just by one refactoring step. If you are planning to learn more about databases, I would suggest looking at the different normal forms and normalization techniques, as well as relational algebra and relational calculus to express your SQL queries more effectively.

GitHub logo jcsho / hype-tracker

Full stack web application to read data from public APIs (Twitter, Reddit) and form visualizations

HypeTracker

Current Version Codacy Badge StyleCI License FOSSA Status


HypeTracker is a data aggregator application for social media 'impressions' on sneakers bought and sold in the aftermarket.

See my blog detailing what HypeTracker is and why I am refactoring it

Changelog / Goals for V2

  • migrate database schema see PR #4
  • migrate internal sneakers API see PR #7
  • migrate front-end see PR #17
  • deploy as demo / push as v2
  • add social media api scraping

Getting Started

Requirements

  • PHP ^7.3
  • Composer ^1.10.13
  • Vagrant ^2.2.10 (with your choice of virtualization layer)
  • (example) VirtualBox ^6.1.14 (with Vagrant's VirtualBox provider)
$ git clone https://github.com/justinhodev/hype-tracker.git
$ cd hype-tracker

# Install Composer Dependencies
$ composer install

# Prepare config for Homestead
$ composer homestead

# Provision the VM
$ vagrant up

# SSH into Vagrant Box
$ vagrant ssh
Enter fullscreen mode Exit fullscreen mode

Docker

coming soon

Database Entity Relationship Model

ER Diagram

License

FOSSA Status






Discussion

Let me know if this article helped you or if you plan on learning more about SQL in the comments below!

πŸ’– πŸ’ͺ πŸ™… 🚩
jcsh
Justin Ho

Posted on September 18, 2020

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related