CockroachDB - creating limited key ranges
Jim Hatcher
Posted on August 30, 2021
In CockroachDB, when choosing keys for your tables' primary keys, the recommendation is to use UUID-based keys because they are globally unique and because they create non-sequential values (which help to "spread" the data into different ranges.)
Often, however, if you're migrating an application to CockroachDB from a different database where the old database schema was using int-based keys, it is often more palatable to keep the int-based keys to avoid application changes.
A nice compromise can be to keep the int-based types but move away from generating the IDs via sequences and instead use the unique_rowid() function to create unique values. The advantage of doing this is that values generated in this manner are more spread out and help to avoid write hotspots.
In CockroachDB, you can create various integer-based values (https://www.cockroachlabs.com/docs/stable/int.html), but typically for id field, you would consider the ones with larger ranges:
- INT4 - four-byte integers with a range between -2^31 through 2^31 (around negative 2 billion to positive 2 billion)
- INT8 - eight-byte integers with a range between -2^63 through 2^63 (around negative 9 quintillion to positive 9 quintillion)
The unique_rowid() function produces INT8-based values. And, since it is responsible for creating unique values, it needs to create large values. You can easily imagine having a table in a system that has more than 4 billion records and would exceed the range of int4-based values.
This brings up another wrinkle which folks sometimes run into when migrating to CockroachDB. Sometimes, while migrating, folks need to move from int4-based values (with id generation happening via sequences) to int8-based fields (with id generation happening via the unique_rowid() function). When they do this, they can run into an interesting limitation in Javascript (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER) which is that values greater than 2^53 can become truncated. You can work around this by passing the integer-based values as strings which allows them to keep their precision. But, this may require a lot of code changes.
Here is a workaround that can be done at the DB-level which uses an int8-based field but makes sure that the values contained in the field are less than 2^53 so they can flow through the Javascript/JSON API levels with minimal impact:
create table modtest ( i8 int, i7 int8 as (mod(i8, 9007199254740991)) stored PRIMARY KEY ); -- this big number is the max int size in javascript
insert into modtest ( i8 ) values ( unique_rowid() );
root@:26257/test> select * from modtest;
i8 | i7
---------------------+-------------------
682240290961522689 | 6700346855948364
Notice that the i7 field is the primary key and can be used as the value to uniquely identify the record through the levels of the system (API, etc.). The i8 field is in the table but is strictly used as a way to create values.
Posted on August 30, 2021
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.