Modeling a SQL Database: Data Oriented Design
Justin Ho
Posted on September 20, 2020
Introduction
Hi, it me
Welcome to the first post in my series of modelling better databases using entity relationship modeling. This post is a high level overview to data-oriented design in the context of relational databases which can change how we approach designing a database. Instead of the object-oriented design model many programmers are used to, which could mean encapsulating both data and functionality into a single object, a data-oriented design model considers data and functionality as separate entities.
This design method was originally documented with game development in mind but its goals and implementations are closely related to how SQL works. That is not to say object-oriented approaches are worse or obsolete, just a different method to look at the problem.
Data-Oriented VS Object-Oriented
It's showdown time
Let's demonstrate some differences between the two approaches with a project example.
Scenario
You need to develop a new course enrollment application for a school. The application users are students and teachers, who are able to assign themselves to a course in their respective roles. Each course can only have one teacher and a limited number of spots for students.
I'll leave out the rest of the details as this is all we need to point out some differences in the approaches.
Object-Oriented Approach
Using an object-oriented approach, one might start with an overarching class which encapsulates the capabilities of both teacher and student. For instance, a User
class which denotes the children class' ability to interact with the system, as well as a Course
class which will contain course data and methods. Next, we might create the Teacher
and Student
class which inherits as well as overrides some methods in the User
class.
Here's what it might look like in code form (in C# syntax):
public abstract class User {
private string _name;
// ...other fields
public abstract bool IsEnrolledInCourse(string courseName);
// ...other methods
}
public class Student : User {
private float _gpa;
// ...other fields
public override bool IsEnrolledInCourse(string courseName) { /* some implementation */ }
// ...other methods
}
public class Teacher : User {
// ...fields
public override bool IsEnrolledInCourse(string courseName) { /* some implementation */ }
// ...other methods
}
public class Course {
private string _name;
// ...other fields & methods
}
Data-Oriented Approach
On the other hand, a data-oriented approach would refactor the IsEnrolledInCourse()
function outside of the Teacher
and Student
classes. In fact, any functions which mutate or checks the state in any of the classes above would be refactored out (so it does not matter if the IsEnrolledInCourse()
logic was in the Course
class). Now let's see how the refactored code looks first, then I'll explain why we would choose to do this.
public static class Students {
public static List<string> Name;
public static List<float> Gpa;
// ...other fields
}
public static class Teachers {
public static List<string> Name;
// ...other fields
}
public static class Courses {
public static List<string> Name;
// ...other fields
}
public static class EnrolledStudents {
public static List<string> courseName;
public static List<string> studentName;
}
public static class EnrolledTeachers {
public static List<string> courseName;
public static List<string> teacherName;
}
Key differences:
- Instead of objects, each class becomes a reference for all data of the specific entity
- There should never be a null value as only valid data should be inserted or removed
- State does not need to be maintained on the data type
- Ex) To check if a student is enrolled in at least one class, you can check if the student's name is in the list of enrolled students instead of maintaining a
bool _isEnrolled
variable on the student object
- Ex) To check if a student is enrolled in at least one class, you can check if the student's name is in the list of enrolled students instead of maintaining a
- Polymorphism is handled by just having different entities (splitting Teacher and Student into its own object) and applying different functions on them
If you haven't used SQL, this also happens to resemble how SQL organizes records of data (tables)!
Closing Thoughts
I hope this gave you new perspective on how programming can be approached and that object-oriented design pattern is not the perfect solution everytime. For those who prefer functional programming, this might just seem all too familiar because the focus is to separate functionality from data, but this details how data should be organized and accessed.
This concludes the introduction to relational databases at a high level and in the next post I will be going over how we can visually describe data-oriented design using entity relationship models.
If you made it this far, let me know in the comments how you feel about data-oriented design and SQL. Are you a beginner and looking to get into databases? Or do you deploy databases at scale and feel I'm totally wrong? Let me know!
Credits & Further Readings
Original Article on Data-Oriented Design by Noel Llopis
Book on Data-Oriented Design by Richard Fabian
Data-Oriented Design in Practice: Unity DOTS
I was inspired to write this after reading Tyler's post on object relational mapping (ORM) libraries, check it out!
Posted on September 20, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.