Postgres UPSERT: Uplifting Database Updates and Inserts

0xog_pg

PGzlan

Posted on August 6, 2023

Postgres UPSERT: Uplifting Database Updates and Inserts

Table of Contents:

Introduction

In the world of databases, updating or inserting data can be a recurring, and complex task. This is especially apparent when dealing with conflicting data or unique constraints. However, Postgres provides a powerful feature called UPSERT.

UPSERT in PostgreSQL

The term upsert is actually the combination of the two words update and in*sert*. When a new row is inserted _into a table, PostgreSQL will perform an _update to the row, provided that it has found an entry for it. If not, then the new row is inserted.

This greatly eases the process of constructing queries, enabling the user to perform updates and inserts using a just one operation.

Using UPSERT in PostgreSQL

To use the upsert feature in PostgreSQL, you can use the INSERT ON CONFLICT statement as follows:

INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...)
ON CONFLICT (constraint_column)
DO action
Enter fullscreen mode Exit fullscreen mode

The ON CONFLICT target action clause was added to the INSERT statement to support the upsert feature.
(Note #1: This is not unique to PostgreSQL. Other RDBMS support it, though the syntax might be different. For example, MySQL supports upserts using the INSERT ... ON DUPLICATE KEY UPDATE statement. Similarly, SQL Server supports upserts using the MERGE statement)
(Note #2: that the ON CONFLICT clause has been introduced from PostgreSQL 9.5)

In an **upsert **statement in PostgreSQL, the DO clause specifies the action to take when a conflict occurs during the INSERT operation. There are two actions that can be specified in the DO clause of an upsert statement in PostgreSQL:

  • DO NOTHING: This action specifies that no changes should be made to the table if a conflict occurs. In other words, if a row already exists with the same values as the row being inserted, the INSERT operation will be skipped and no changes will be made to the table.
  • DO UPDATE: This action specifies that the existing row should be updated if a conflict occurs. You can use the SET clause to specify which columns should be updated and what their new values should be. The WHERE clause can also be used to specify additional conditions for the update.

Another way of performing upserts in Postgres is by using MERGE (also known as UPSERT in some other databases).
(Note #3: that the MERGE statement has been introduced from PostgreSQL 15)

Method 1: INSERT ... ON CONFLICT

The INSERT ... ON CONFLICT statement allows you to specify a conflict resolution action when a unique constraint violation occurs during an insert operation. Here's the basic syntax:

INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...)
ON CONFLICT (constraint_column)
DO UPDATE SET column1 = value1, column2 = value2, ...
Enter fullscreen mode Exit fullscreen mode

We'll use a simple example to illustrate this. Let's say we have a users table with columns id, name, and email, where id is the primary key. We want to insert a new user or update the name and email if the user already exists.

-- Create the users table
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100)
);

-- Insert or update a user
INSERT INTO users (id, name, email)
VALUES (1, 'John Doe', 'john@upsertdemo.com')
ON CONFLICT (id)
DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email;
Enter fullscreen mode Exit fullscreen mode

In the above example, if a user with id 1 already exists, the name and email will be updated with the new values. Otherwise, a new row will be inserted.

But wait. Where did this EXCLUDED come from?

In an upsert statement in PostgreSQL, EXCLUDED is a special table that is used to reference the values that were proposed for insertion in the INSERT statement.

The EXCLUDED table has the same columns as the table being inserted into, and its values are the values that would have been inserted if the INSERT statement had not encountered a conflict.

The EXCLUDED table is not a physical table that is stored on disk but rather a virtual table that is created and used during the execution of an INSERT ... ON CONFLICT statement in PostgreSQL and exists only in memory during the execution of the INSERT ... ON CONFLICT statement.

Method 2: MERGE (UPSERT)

The MERGE statement allows you to perform both update and insert operations based on a specified condition.

Here's the basic syntax of the MERGE statement:

MERGE INTO target_table AS target
USING source_table AS source
ON (condition)
WHEN MATCHED THEN
    UPDATE SET column1 = value1, column2 = value2, ...
WHEN NOT MATCHED THEN
    INSERT (column1, column2, ...)
    VALUES (value1, value2, ...)
Enter fullscreen mode Exit fullscreen mode

Let's make use of the same example as before and showcase the MERGE statement:

-- Create the users table
CREATE TABLE users (
   ...
);

-- Merge (UPSERT) a user
MERGE INTO users AS target
USING (VALUES (1, 'John Doe', 'john@upsertdemo.com')) AS source (id, name, email)
ON (target.id = source.id)
WHEN MATCHED THEN
    UPDATE SET name = source.name, email = source.email
WHEN NOT MATCHED THEN
    INSERT (id, name, email)
    VALUES (source.id, source.name, source.email);
Enter fullscreen mode Exit fullscreen mode

In the above example, the MERGE statement first checks if a user with id 1 exists in the users table. If it does, the name and email will be updated with the new values. Otherwise, a new row will be inserted.

Choosing the Right Method

Both INSERT ... ON CONFLICT and MERGE provide UPSERT functionality, but their usage depends on your specific requirements and the version of Postgres you are using. If you are working with Postgres versions prior to 9.5, INSERT ... ON CONFLICT is not available, and you should use alternative approaches like UPDATE followed by INSERT in separate statements.

If you have Postgres 9.5 or greater (but not 15), You can leverage the power of INSERT ... ON CONFLICT. If you are on Postgres 15 or higher, then the world is your oyster.

The choice between them depends on factors such as performance, complexity, and personal preference. It's recommended to benchmark and compare the performance of both methods with your specific workload and dataset to make an informed decision.

Conclusion

Updating or inserting data in a database can be a cumbersome task, especially when dealing with conflicting data. However, Postgres simplifies this process with its UPSERT functionality. In this blog post, we explored two methods to achieve UPSERT in Postgres: INSERT ... ON CONFLICT and MERGE.

We learned that INSERT ... ON CONFLICT allows you to specify a conflict resolution action when a unique constraint violation occurs during an insert operation. This method is suitable for Postgres versions prior to 9.5. On the other hand, MERGE (UPSERT) is introduced in Postgres 9.5 and provides a more concise and flexible syntax for performing both update and insert operations based on a specified condition.

Choosing the right method depends on your specific requirements and the version of Postgres you are using. It's important to consider factors such as performance, complexity, and personal preference when making the decision.

With the UPSERT functionality in Postgres, you can simplify your database operations, ensure data consistency, and streamline your application logic. Whether you choose INSERT ... ON CONFLICT or MERGE, UPSERT empowers you to efficiently handle conflicting data and keep your database up to date.

Happy UPSERTing!

References

UPSERT - PostgreSQL wiki.
Commit Fest
PostgreSQL: Documentation: 15: MERGE.
SQL MERGE - PostgreSQL wiki
postgresql - Difference between UPSERT and MERGE
PostgreSQL Merge | Quick Glance on PostgreSQL Merge - EDUCBA.
[How can I return EXCLUDED from WITH in PostgreSQL?.](

💖 💪 🙅 🚩
0xog_pg
PGzlan

Posted on August 6, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related