Arctype Team
Posted on October 12, 2021
Introduction to Key-Value Store
We have seen several features in PostgreSQL that go beyond relational storage, such as:
- JSON: "Forget SQL vs NoSQL - Get the Best of Both Worlds with JSON in PostgreSQL" (Derek Xiao for Arctype)
- Geospatial data: "Working with geospatial data in Postgres" (Derek Xiao for Arctype)
- Fuzzy Search
- Full-Text Search
This article will explore the key-value store feature, which is less popular, but still plays an integral part in PostgreSQL's NoSQL capabilities.
Setting Up PostgreSQL with Extension and Data
Unlike JSON and JSONB, the key-value store comes as a separate extension called hstore. It ships with PostgreSQL as a contrib module, so we can usually enable it right away (some distributions package the contrib modules separately, for example as postgresql-contrib).
Enabling HStore
CREATE EXTENSION hstore;
This will enable the key-value store extension.
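Note: Before PostgreSQL 13, the extension can be enabled only with superuser access. From PostgreSQL 13 onward, hstore is a trusted extension, so a user with the CREATE privilege on the current database can enable it as well. The system will throw an error otherwise.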
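As a small sketch (not part of the original setup), the statements below show an idempotent way to enable the extension and a query to confirm that it is present. The pg_extension catalog is standard PostgreSQL; the version reported depends on the installation.

```sql
-- Idempotent variant: only raises a notice if hstore is already enabled.
CREATE EXTENSION IF NOT EXISTS hstore;

-- Confirm that the extension is installed in the current database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'hstore';
```

An empty result from the second query means the extension is not enabled in the database we are connected to, since extensions are enabled per database.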
Creating Table and HStore Columns
Let's create a simple demo table where we maintain scores of people in key-value fashion. To do that, we declare the column's data type as hstore.
CREATE TABLE hstore_example (score hstore);
We do not need to specify a type for the keys or the values (i.e., whether a key is an int or text): hstore stores both as text, and a value may also be NULL. Proper type casting is therefore necessary either at the database level or at the application level for manipulation. This is similar to extracting a value from JSON or JSONB with the ->> operator, which also returns text.
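To illustrate the casting point, here is a minimal sketch that uses an hstore literal rather than a table, so it only assumes the extension is enabled. The column aliases are made up for the example.

```sql
-- The -> operator returns text, whatever the value looks like.
SELECT pg_typeof('"Jason" => 100'::hstore -> 'Jason') AS value_type;

-- Cast to int before doing arithmetic on the value.
SELECT ('"Jason" => 100'::hstore -> 'Jason')::int + 50 AS boosted_score;
```

The first query reports text as the type; the second only works because of the explicit ::int cast, and would fail with a cast error if the stored string were not a valid integer.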
Populating Test Data
Inserting data into hstore columns is pretty straightforward.
INSERT INTO hstore_example values('"Jason" => 100');
INSERT INTO hstore_example values('"Jack" => 200');
INSERT INTO hstore_example values('"Perry" => 150');
Note that hstore is flat: a value cannot itself be another hstore or a nested structure. A single hstore value can hold many key-value pairs, though; for this example, each row stores just one pair.
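For reference, the following sketch shows a few other ways to build hstore values. It uses standalone SELECT statements so the demo table is left unchanged; the keys math and physics are invented for illustration.

```sql
-- Several pairs in one literal; keys and values are still plain text.
SELECT 'math => 90, physics => 85'::hstore;

-- Build a value from a single key and value.
SELECT hstore('Jason', '100');

-- Build a value from parallel arrays of keys and values.
SELECT hstore(ARRAY['math', 'physics'], ARRAY['90', '85']);

-- An unquoted NULL in the literal form is stored as a SQL NULL value.
SELECT ('math => NULL'::hstore -> 'math') IS NULL AS value_is_null;
```

Duplicate keys within one value are not kept: only one pair per key is stored.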
Getting Started with Key-Value Queries
We can query for rows by searching for keys.
SELECT
*
FROM
hstore_example
WHERE
score ? 'Jason';
We can also get the value for a particular key:
SELECT
score -> 'Jason' as score
FROM
hstore_example
WHERE
score -> 'Jason' is NOT NULL;
An exhaustive list of operations can be found in the official PostgreSQL documentation for HStore:
https://www.postgresql.org/docs/current/hstore.html
There are several powerful operators that let us do a variety of data manipulations without having to handle them in application logic. Doing data manipulation at the database level is often preferable to doing it at the application level, for reasons such as performance and security.
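As a rough sketch of what such manipulation can look like, the statements below use a handful of operators and functions from the hstore documentation against the hstore_example table. The update is wrapped in a transaction and rolled back so the demo data stays as inserted; the increment of 10 is an arbitrary choice.

```sql
-- Containment: rows whose value includes this exact key-value pair.
SELECT * FROM hstore_example WHERE score @> '"Jack" => 200';

-- Expand every pair into one (key, value) row.
SELECT key, value FROM hstore_example, each(score);

-- Show each value with one key removed (does not modify the table).
SELECT delete(score, 'Perry') FROM hstore_example;

BEGIN;
-- Increment Jason's score: read, cast, add, then write back as text.
-- The || operator replaces the pair for an existing key.
UPDATE hstore_example
SET score = score || hstore('Jason', ((score -> 'Jason')::int + 10)::text)
WHERE score ? 'Jason';

SELECT score -> 'Jason' AS score FROM hstore_example WHERE score ? 'Jason';
ROLLBACK;
```

Because the whole value is rewritten on update, this pattern suits small values better than very large ones.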
Indexing for Faster Queries
The HStore
extension supports indexes of the GIN
type. We can create these indexes as follows,
CREATE INDEX hstore_example_idx ON hstore_example USING GIN (score)
This is comparable to the GIN index we can create for the JSONB type. It can significantly speed up queries that test for keys or key-value pairs with operators such as ?, ?&, ?| and @>, where ordinary comparisons would require scanning every row. Because hstore data is flat text, these indexes can also be smaller than a GIN index over comparable JSONB data and may operate more readily from memory/cache, although the actual size depends on the data.
On the other hand, an index will slow down the writes, so a proper tradeoff has to be considered before we extensively use an index. Benchmarking should be done before taking an application/solution to production.
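One way to check whether a query can use the index is EXPLAIN. The sketch below assumes the hstore_example table and the GIN index on its score column exist. On a table with only three rows the planner will normally choose a sequential scan, so the example discourages that for the current session purely for demonstration.

```sql
-- Demonstration only: make sequential scans unattractive for this session.
SET enable_seqscan = off;

-- The ? operator is one of the operators the GIN index supports.
EXPLAIN
SELECT * FROM hstore_example WHERE score ? 'Jason';

-- Restore the default planner behaviour.
RESET enable_seqscan;
```

A condition written as score -> 'Jason' IS NOT NULL is not among the operators handled by the GIN operator class, so the existence operator ? is the better choice when the index is meant to help.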
Internal Structure of HStore
HStore is stored internally as a varlena (PostgreSQL's variable-length datum format). Similar to JSON, the whole field has to be read from disk for any read or modification. As a result, HStore is not as optimized as dedicated key-value stores such as Redis and Memcached.
The documentation on TOAST storage is also recommended reading for why reads and writes operate on the whole field rather than on parts of it:
https://www.postgresql.org/docs/current/storage-toast.html
For this reason, and because hstore values are flat, if the K/V structure is complex or nested, we are better off storing it as a proper JSON or JSONB type.
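If the data does outgrow hstore, the extension provides conversion functions to JSON and JSONB. The sketch below previews the converted values and then copies them into a new table; jsonb_example is a hypothetical table name chosen for this example.

```sql
-- Plain conversion: every value becomes a JSON string.
SELECT hstore_to_jsonb(score) FROM hstore_example;

-- "Loose" conversion: values that look like numbers or booleans
-- are emitted as JSON numbers or booleans instead of strings.
SELECT hstore_to_jsonb_loose(score) FROM hstore_example;

-- Copy the converted data into a new table (hypothetical name).
CREATE TABLE jsonb_example AS
SELECT hstore_to_jsonb_loose(score) AS score
FROM hstore_example;
```

Copying into a new table keeps the original column and its index untouched while the JSONB version is being evaluated.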
Conclusion
Potential use cases include, but are not limited to, the following:
- Maintaining a cache.
- Fast read/update scenarios such as maintaining an API rate limiter.
- Maintaining scores/simple time-series data.
HStore is not a replacement for dedicated key-value stores; those are much more optimized for their own use cases. Still, these features exist in PostgreSQL because there are many situations where using a separate data store would be overkill and a maintenance burden. We can use PostgreSQL's NoSQL capabilities as a bridge to fill that gap. If the business expands, we can migrate to a different database. Until then, we can comfortably use PostgreSQL, which provides ACID-compliant NoSQL features without the additional overhead of maintaining a separate database.