Antonello Zanini
Posted on January 19, 2024
PostgreSQL is one of the most advanced and widely used open-source RDBMS (Relational Database Management System). It is particularly appreciated by the developer community because it supports both SQL and JSON querying. So it is both relational and non-relational compliant.
Yet, it has some well-known issues, and one of the most annoying ones involves sequences. In detail, these are prone to go out of sync and this prevents you from inserting new rows. On top of that, in these cases, PostgreSQL returns an error message that does not allow you to easily understand what is going on.
Thus, knowing about the existence of the out-of-sync issue on sequences is crucial, because addressing it is not complex, and being able to identify it immediately can save your day. Now, let's see what sequences are, why this issue happens, how to spot it, and how to finally solve it.
Delving into the out-of-sync sequence problem in PostgreSQL
Let's now focus on the out-of-sync issue that occurs on PostgreSQL sequences, but first let's try to understand what a sequence is.
What is a PostgreSQL sequence?
As explained in the official documentation, a sequence is nothing more than a number generator. In particular, PostgreSQL uses a sequence to generate a progressive number, which usually represents an automatically incremented numeric primary key.
If you are familiar with MySQL, the result of having a sequence in PostgreSQL is comparable to the AUTO_INCREMENT
behavior. The main difference between the two is that the PostgreSQL sequences can also start from a defined value and then decrement it at each INSERT.
You can define a sequence in PostgreSQL with the CREATE SEQUENCE
statement. Similarly, the special type SERIAL
initializes an auto-incremental numeric primary key using a sequence behind the scenes. Generally, you should use SERIAL when creating a new table with CREATE TABLE.
When do sequences go out of sync?
Based on my experience as a senior software developer, there are three reasons why a PostgreSQL sequence can go out of sync. Specifically, this happens when:
Importing many rows with an INSERT script or restoring a large database;
Manually setting the wrong value of a sequence with the
setval()
function;Inserting a new record into a table by manually specifying the
id
field in theINSERT
query.
The last one is the most common case and happens because PostgreSQL uses a sequence and automatically updates its value only when you omit the id
field or use the DEFAULT
keyword in the INSERT
query.
How to identify this issue?
Spotting this issue is easy. When you specify no value for the id column and manually launch an INSERT
query or let your ORM (Object-Relational Mapping) do it for you, PostgreSQL will always return a "duplicate key value violates unique constraint" error. Also, this will not happen when performing UPDATE
queries.
As you can imagine, the problem lies in the fact that the sequence related to the id
column went out of sync. Specifically, the error happens when the sequence returns a value for the id
column that is already in use. And this leads to the aforementioned error because an ID must be unique by definition.
How to solve the problem
It is not time to see how to address the out-of-sync problem and solve it once and for all.
Single table solution
Suppose you want to fix the sequence associated with the id
column of your Users table. You can achieve this by launching the following query:
SELECT SETVAL('public."Users_id_seq"', COALESCE(MAX(id), 1)) FROM public."Users";
Such query will take care of updating the Users_id_seq sequence by setting its current value to the result of COALESCE(MAX(id), 1). Notice how the name of the sequences in PostgreSQL follows this notation:
`"<table-name>_<column-name>_seq"`
The COALESCE
function returns the first non-null value, and it is required because if Users
was empty, MAX(id)
would return NULL
. So, by using COALESCE
you are sure that the value assigned to Users_id_seq will be MAX(id)
when Users is not null, and 1 when Users is null. In both cases, that query sets the desired value.
Fixing all your sequences with one script
If you wanted to fix all your sequences with one query, you could use the following script coming from the official PostgreSQL Wiki:
SELECT 'SELECT SETVAL(' ||
quote_literal(quote_ident(PGT.schemaname) || '.' || quote_ident(S.relname)) ||
', COALESCE(MAX(' ||quote_ident(C.attname)|| '), 1) ) FROM ' ||
quote_ident(PGT.schemaname)|| '.'||quote_ident(T.relname)|| ';'
FROM pg_class AS S,
pg_depend AS D,
pg_class AS T,
pg_attribute AS C,
pg_tables AS PGT
WHERE S.relkind = 'S'
AND S.oid = D.objid
AND D.refobjid = T.oid
AND D.refobjid = C.attrelid
AND D.refobjsubid = C.attnum
AND T.relname = PGT.tablename
ORDER BY S.relname;
When executed, this query returns the set of queries required to fix each of your sequences. As stated in the Wiki, you should use this query as follows:
Save the query in a
fix_sequences.sql
fileRun the query contained in the
fix_sequences.sql
file and store the result in a temp file. Then, run the queries contained in the temp file. Finally, delete the temp file. You can achieve this with the following three commands:
psql -Atq -f fix_sequences.sql -o temp
psql -f temp
rm temp
How to verify that the problem was fixed
Now, all you have to do to verify that your sequence is no longer out-of-sync is to insert a new record in the same table where you originally experienced the issue.
Remember to give the id
column the DEFAULT
value or omit it entirely in the INSERT
query. This way, PostgreSQL will use the "Users_id_seq" sequence behind the scene to retrieve the right value to give to id
.
For example, launch the following query, and you should no longer receive the "duplicate key value violates unique constraint" error message:
INSERT INTO "Users"("id", "name", "surname", "email", "password")
VALUES (DEFAULT, "Jennifer", "Jones", "jennifer.jones@email-domain.com", "pBHxe*cWnC2ZJKHw");
Instead, this query will insert a new record in the Users table as expected.
Conclusion
PostgreSQL is undoubtedly a great RDBMS. At the same time, it has a few issues that can waste your time. This is especially true if you are not aware of them, do not know how to identify them, and then address them accordingly. The out-of-sync issue on sequences is tricky because it leads to a "duplicate key value violates unique constraint" message error, which is generally associated with other problems. So, knowing it is very crucial, and here you learned why it occurs, how to detect it, and how to fix it.
The post "How to fix a sequence when it goes out of sync in PostgreSQL" appeared first on Writech.
Posted on January 19, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.