PostgreSQL Process and Memory Architecture (Chapter 2: The Internals of PostgreSQL)

In my previous post I covered the basics of PostgreSQL to lay a foundation we can build on. Continuing from there, this post looks at its process and memory architecture. As background, PostgreSQL is a powerful open-source relational database management system that has grown steadily in popularity. It offers robust features and scales well, which makes it a common choice for applications ranging from small web applications to large data warehouses. Before going deeper into PostgreSQL, it helps to understand how its processes and memory are organized.

In this post, I give an overview of the PostgreSQL process and memory architecture, focusing on the multi-process design, and explain the main types of processes and their roles in managing the database cluster.

Process Architecture

PostgreSQL is a client-server relational database management system that runs on a single host. It uses a multi-process architecture, in which several processes cooperate to manage a single database cluster. This collection of processes is known as a 'PostgreSQL server.' This post covers two of the process types that make up a PostgreSQL server:

Postgres Server Process

The postgres server process is the parent of all processes involved in managing a database cluster. At startup it allocates a shared memory area, starts various background processes, and, if necessary, starts replication-associated processes and background worker processes.

Backend Processes

A backend process, also called postgres, is started by the postgres server process and handles all queries issued by one connected client. It communicates with the client over a single TCP connection and terminates when the client disconnects. PostgreSQL allows multiple clients to connect simultaneously; the maximum number is controlled by the configuration parameter max_connections (default 100).
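To make the one-backend-per-client idea concrete, here is a minimal Python sketch of the pattern. It is not PostgreSQL code: the names run_server and backend_main, the pipes used instead of TCP connections, and the rejection message are all assumptions made for illustration. It relies on os.fork, so it runs on Unix-like systems only.

```python
import os

MAX_CONNECTIONS = 100  # mirrors the default of max_connections mentioned above


def backend_main(queries):
    """Backend role: serve every query from one client, then exit."""
    return [f"backend pid={os.getpid()} ran: {q}" for q in queries]


def run_server(clients, max_connections=MAX_CONNECTIONS):
    """Server-process role: fork one backend per client, up to the limit."""
    backends = []  # (client name, backend pid, read end of its pipe)
    results = {}
    for name, queries in clients.items():
        if len(backends) >= max_connections:
            # Hypothetical rejection message; the limit itself is the point.
            results[name] = (None, ["rejected: too many clients"])
            continue
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: this process is now the dedicated backend for one client.
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "w") as out:
                    out.write("\n".join(backend_main(queries)))
            finally:
                os._exit(0)  # the backend terminates when its client is done
        os.close(write_fd)
        backends.append((name, pid, read_fd))
    # Parent: collect each backend's output, then reap the child process.
    for name, pid, read_fd in backends:
        with os.fdopen(read_fd) as inp:
            results[name] = (pid, inp.read().splitlines())
        os.waitpid(pid, 0)
    return results
```

Calling run_server({"alice": ["SELECT 1"], "bob": ["SELECT 2"]}) returns one (pid, output) pair per client, and each client's pid differs from the others and from the parent's, which is the essence of the multi-process model.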

Memory Architecture

PostgreSQL uses shared memory so that multiple processes can access the same data at the same time. Frequently accessed data pages are cached in memory to improve query performance. The shared buffer pool, a large block of shared memory whose size is set by the shared_buffers parameter, is the main area in which PostgreSQL buffers data pages. Each backend process reads and writes data pages through the shared buffer pool.
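The effect of a shared memory area can be shown with a small Python sketch. It assumes a Unix-like system (it uses os.fork and an anonymous shared mmap), and the pool size, page number, and function name demo_shared_pool are hypothetical choices for illustration. Only the 8 KB page size reflects PostgreSQL's default block size; the real buffer manager is far more involved.

```python
import mmap
import os

PAGE_SIZE = 8192  # PostgreSQL's default block size is 8 KB
NUM_PAGES = 4     # hypothetical, deliberately tiny pool


def demo_shared_pool():
    """Parent allocates a shared pool; a forked 'backend' writes one page."""
    # Anonymous mapping: on Unix it is shared with children created by fork().
    pool = mmap.mmap(-1, PAGE_SIZE * NUM_PAGES)
    start = 2 * PAGE_SIZE  # byte offset of page 2 in the pool
    pid = os.fork()
    if pid == 0:
        # Child ("backend"): modify page 2 directly in shared memory.
        try:
            payload = b"row written by backend"
            pool[start:start + len(payload)] = payload
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    # Parent: the child's write is visible without any copying or messaging.
    page = pool[start:start + PAGE_SIZE]
    pool.close()
    return page.rstrip(b"\x00")
```

The parent never receives a message from the child; it simply reads the same memory, which is why a page loaded or modified by one backend is immediately available to the others.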

The WAL (Write-Ahead Logging) buffer, another shared memory area, holds transaction log (WAL) data before it is written to the WAL file. Because WAL records reach persistent storage before the corresponding changes to the data files, PostgreSQL can replay them after a system crash or other failure and recover committed changes.
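The write-ahead idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PostgreSQL's WAL format: the TinyWal class, the JSON record layout, and the file name wal.log are all invented for illustration. Records sit in an in-memory buffer, a commit flushes them to a log file, and recovery replays whatever reached the file.

```python
import json
import os
import tempfile


class TinyWal:
    """Toy write-ahead log: buffer records in memory, flush them on commit."""

    def __init__(self, path):
        self.path = path
        self.buffer = []  # stands in for the WAL buffer

    def log(self, key, value):
        # Record the change in the buffer only; nothing is on disk yet.
        self.buffer.append({"key": key, "value": value})

    def commit(self):
        # Append buffered records to the log file and force them to disk.
        with open(self.path, "a", encoding="utf-8") as f:
            for record in self.buffer:
                f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.buffer.clear()


def recover(path):
    """Rebuild state by replaying every record that reached the log file."""
    state = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                state[record["key"]] = record["value"]
    return state


def demo_crash_recovery():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "wal.log")
        wal = TinyWal(path)
        wal.log("a", 1)
        wal.log("b", 2)
        wal.commit()     # these two records are now durable
        wal.log("c", 3)  # still only in the buffer
        del wal          # simulated crash: the buffer contents are lost
        return recover(path)
```

In this toy model, demo_crash_recovery() brings back the two committed records and drops the uncommitted one, which mirrors the guarantee described above: what was flushed to the log survives, what was only in the buffer does not.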

Key Takeaways:

To sum up, this post gave an overview of the process and memory architecture of PostgreSQL, covering the multi-process design, shared memory, and the types of processes involved in managing the database cluster. Understanding this architecture is an important step toward building robust and scalable applications on PostgreSQL. In the coming chapters, we will look more closely at each of the processes and their functions.

References:

https://www.interdb.jp/pg/pgsql02.html
