Database Management Systems 4

TL;DR

Last blog post we covered ACID properties, Concurrency, and Recovery in our database systems.

This post I have dedicated time to cover Online Transaction Processing vs Online Analytical Processing as well as Indexing and Hashing in our databases.

To note, OLTP vs OLAP looks requires some fundamental knowledge of database systems and maybe ETL (optional). Aside from that Indexing and Hashing are ways to optimize our query performance.

Introduction

By the end of this tutorial, you will understand

OLTP vs OLAP
Indexing vs Inverted Indexing
Hashing

Core Concepts

OLTP

Online transaction processing provides transaction focused applications for our data. Examples to clarify would be ATM's are OLTP based applications.

This process consist of current data, and has business task use cases. Also composed of both read and write operations when we think of our database implementations.

OLAP

Online analytical processing provides software tools/applications focused of data analytics. An example to clarify would be Netflix's movie recommendation system.

This process consist of historical data, and the data is used in planning/problem solving for business's as well as decision making. Will primarily be composed of only read and rarely any write operations.

Side Note

I really believe taking time to understand the Extract Transform Load process can help clarify further what OLTP vs OLAP will provide you in your system architecture so definitely check that out.

Indexing vs Inverted Indexing

Indexing is a way to optimize the performance of a database by minimizing the number of disk accesses required when a query is processed.

Example:

CREATE INDEX index_name
ON table_name (column1, column2, ...);

IMPORTANT: Updating a table with indexes takes more time than updating a table without (because the indexes also need an update). So, only create indexes on columns that will be frequently searched against.

Inverted Indexing is a way of mapping content such as elements, data or objects to a particular data object or document. (Google Search: Inverted Indexing)

In short, indexing is a way to highly optimize the retrieval of information and data through stored queries.

Hashing

Hashing is an efficient technique to directly search the location of desired data on the disk without using index structure.

Typically searching through all the data points in hopes of quick retrieval can bog down or inadvertently have some inefficiency when it comes to index's. Hashing solves this.

To conclude, hashing involves looking at data thats stored on blocks of data addressed via a hash function. This works as the particular address is where that memory allocation will live and is known as a data bucket.

This concludes part 4 of the series, I will post more on part 5 shortly.