What's coming to PostgreSQL ? 🐘

gaspardb

Gaspard Boursin

Posted on April 30, 2024

What's coming to PostgreSQL ? 🐘

PostgreSQL, the powerful, reliable and highly extensible open-source relational DBMS, has been a mainstay in the database world for over three decades, gaining a loyal following among developers and organizations alike, for an ever-growing number of use case.

In the past couple years, amidst the increasing importance of "global-scale" P/IaaS like Vercel and Cloudflare, many MySQL or SQLite-based solutions like Planetscale, Turso, and Cloudflare D1 have been making waves with their serverless architecture, global redundancy promises, unlimited storage and easy scalability.

At the same time, PostgreSQL is undergoing a series of transformations that could redefine its role in the database landscape.
In this article, we'll explore some of the exciting developments shaping the future of PostgreSQL, with a focus on the contributions of companies like Supabase, Neon, ParadeDB or AWS.


Separation of Compute and Storage

One of the most significant advancements coming to the PostgreSQL ecosystem lies in the separation of compute and storage. Companies like Neon, AWS, and recently Supabase are pioneering this approach, which brings major benefits:

  • Horizontal Scalability: Just add more compute nodes to allow the system to handle increased load without hitting performance walls.
  • Cost-Effective Scaling: The separation allows for independent scaling of compute and storage resources based on specific workload requirements, optimizing costs and possibly offloading storage to S3.
  • Seamless Replication and High Availability: The decoupled architecture simplifies replication and enables high availability setups.
  • Branching and Improved Developer Experience: The storage layer can be easily forked and attached to new compute instances, facilitating branching and enhancing the development workflow.

For a deeper dive into the benefits and implications of this architecture, I recommend exploring this insightful blog post from David Gomes and the linked resources.

Neon is already offering a production-ready postgres serverless offering using this technology, all open source, jump to their hello-world post for an overview of their approach.


Pluggable Storage and Improved Table Engines

At the heart of every database lies its table engine.
In Postgres 12, a basic support for a "Pluggable Storage" was introduced, as a way to define different table engines for different tables in the same database.

This unlocks the potential to use a single PostgreSQL instance for various use cases, from OLAP and AI vector search to log search, without compromising performance or storage efficiency.

However the initial implementation had limitations, stalled and didn't fully achieve its goals.

Some solutions like CitusData's columnar storage leveraged this initial API but came with significant restrictions over core PostgreSQL features.

Enter Oriole.
Acquired by Supabase officially in April 2024, Oriole is designed as a drop-in replacement for the existing storage engine, bringing notable performance improvements and advanced features. It reduces I/O (enabling decoupled storage and compute), supports page-level data compression, and introduces index-organized tables.

create extension orioledb; -- enable the extension

create table posts (
  id int8 not null,
  title text not null,
  body text not null,
  primary key (id)
) using orioledb;    -- Specify the storage format
Enter fullscreen mode Exit fullscreen mode

The Oriole team is actively driving contributions around Pluggable Storage, building upon the efforts initiated in PostgreSQL 12 to create a more integrated extension ecosystem, supporting many more advanced extensions and use cases on standard PostgreSQL instances.


A few words about us

Our project, Meteroid, is a modern and open-source billing platform that focuses on business intelligence and actionable insights.

To allow our users to self-host without our complete saas-focused infrastructure, including a Clickhouse cluster, we've been researching options to leverage Postgres more intensively for all our different use cases. We figured we would share some of our findings about the ecosystem through this blog post.

If you have a minute, please consider starring us on Github ⭐️ ! This would help us a lot ❤️


Extending PostgreSQL: OLAP, AI, Search, and Beyond

The PostgreSQL extension ecosystem has seen remarkable achievements lately. Two notable examples are:

  • pgvector: This community-driven effort provides a vector data type and search functionality for high-dimensional vectors. Despite its relatively small codebase, pgvector rivals the performance of many specialized vector databases.

  • ParadeDB: Implemented in Rust, this open-source PostgreSQL extension(s) delivers outstanding results in both OLAP (closing the gap with ClickHouse) and Search (outperforming Elasticsearch in some benchmarks) use cases.

With the new Pluggable Storage model, these extensions and many others could be easily added to any PostgreSQL instance, potentially even interacting with each other, unlocking powerful combinations and enabling PostgreSQL to handle a wide range of workloads efficiently.

Many other projects exist around the postgres community, to cover emerging use cases. A couple examples :

  • electric-sql (the people behind CRDTs) is building a sync layer between postgres and local/embeded SQLite database to build local-first apps, a paradigm shift toward dynamic and instant local apps

  • AWS RDS team created Trusted Language Extensions (pg_tle), an initiative to make postgres extensions more accessible to users relying on a cloud managed database. This is a pretty huge step for the open-source/self-hosted community, as until then building on top of an extension could make the deployment a lot more complex.


A couple of resources to delve deeper

  • Scaling Postgres, a youtube podcast from Creston Jamison with a lot of discussions and news curation across the postgres ecosystem

  • pg is eating the db world from @fengruohang , an extensive blog post to understand the unique position of Postgres in the ecosystem


The Future may be both Unified and Distributed

As applications become increasingly global and complex, the need for a distributed, multi-purpose database is more pressing than ever.

The incredibly extensible core of Postgres, alongside the massive community and shared efforts from many players across the industry, has led to impressive advancements. From the separation of compute and storage to a new Pluggable Storage, and the growing number of highly powerful extensions across languages and domains, Postgres is poised as a strong contender for a global, holistic database solution, to meet the demands of modern software development.

The future of PostgreSQL may not just be about being a reliable and feature-rich database, but about becoming a catalyst for building applications that are fast, scalable, and adaptable to the ever-changing needs of businesses and users.


I hope you enjoyed this read ! If you found it helpful, don't forget to give our repo a star! Your support means the world to us.

⭐️ Star Meteroid ⭐️

Cheers !

💖 💪 🙅 🚩
gaspardb
Gaspard Boursin

Posted on April 30, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related