Exploring hstore, PostgreSQL’s simple key-value store
One of the highlights of PostgreSQL is its versatility, especially when you need a flexible data model. In this post, we’ll explore hstore, PostgreSQL’s simple key-value store, and take a look at how it works using a catalog of books.
PostgreSQL comes with a lot of features that makes working with unstructured and semi-structured data easy. hstore is a PostgreSQL extension that implements the
hstore data type. It’s a key-value data type for PostgreSQL that’s been around since before JSON and JSONB data types were added. You can use it anywhere you need to store simple key-value pairs but don’t want to add an extra column (and run a migration) for every new key added. Since it supports GIN and GiST indexing, lookup is fast and memory efficient.
hstore can be used with, rather than a replacement for, the
JSONB data types. You might consider
hstore if you have ancillary data that doesn’t fit into your relational columns, isn’t frequently looked up, or doesn’t have a nested structure. For example, if we were talking about books in a bookstore, you could store the information that pertains to all books (ISBN, title, etc) as table columns, while storing simpler and less-accessed data such as descriptions of pictures, themes, or whether the book is part of a larger series in an
In this example, we’ll show you how to use
hstore by creating a catalog of books. Using an
hstore column of book attributes, we’ll look at how to add, delete, and edit this data and how to query for book attributes using its specialized operators and functions.
So, let’s get started by setting up the catalog of books.
The book catalog
The catalog that we’ll create comprises books having a title, author/editor, year, publisher, and an attribute column. The attribute columns contain key-value pairs that include ancillary information about a book. The ancillary data about each book is not standardized across the catalog and includes information like the number of pages, subject type, and its physical contents.
In order to use
hstore, we have to install the extension. The extension comes with IBM Cloud Databases for PostgreSQL. To install it, run the following from the psql shell:
CREATE EXTENSION hstore;
Next, we’ll need a table to store the catalog of books. For that, we’ll set up a table like the following:
CREATE TABLE books ( id SERIAL PRIMARY KEY, title TEXT, author_editor TEXT, year INT, publisher TEXT, attributes hstore );
Notice that the
attributes column uses
hstore. Once you create the table, you can confirm that the column has indeed been created by using the psql metacommand
\d books, where we’ll have a table like this:
With the table set up, we can add some books to the table:
INSERT INTO books (title, author_editor, year, publisher, attributes) VALUES ( 'The Iliad','Homer',1998,'Oxford UP', 'category => "mythology", pages => "446", translation => "Robert Fitzgerald", introduction => "G. S. Kirk"' ), ( 'How to Read Egyptian Hieroglyphs','Mark Collier',1998,'University of California Press', 'category => "language", binding => "hardback", pages => "179"' ), ( 'Beginner''s Latin','G. D. A. Sharpley',2003,'McGraw-Hill', 'category => "language learning", language => "Latin", pages => "232", includes => "CD"' ), ( 'Game of Thrones','George R. R. Martin',1996,'Bantam', 'category => "fiction", pages => "694", binding => "hardback", includes => "maps"' ), ( 'Old and Middle English: An Anthology','Elaine Treharne',2000,'Blackwell Publishers', 'category => "anthology", language => "old english", language_2 => "middle english", pages => "622", translation => "Elaine Treharne"' ), ( 'History of Ancient Egypt','Erik Hornung',1999,'Cornell UP', 'category => "history", pages => "185", translation => "David Lorton", translation_language => "German", binding => "paper"' ), ( 'The Making of Textual Culture: Grammatica and Literary Theory 350 - 1100','Martin Irvine',1994,'Cambridge UP', 'category => "history", category_2 => "literary theory", pages => "604", series => "Cambridge Studies in Medieval Literature", series_num => "19"' ), ( 'Beowulf: A Verse Translation','Daniel Donoghue',2002,'W. W. Norton and Co.', 'category => "literature", pages => "256", series => "Norton Critical Edition", translation => "Seamus Heaney"' ), ( 'A History of the Vikings','Gwyn Jones',1987,'Oxford UP', 'category => "history", pages => "504", includes => "maps and plates"' ), ( 'PostGIS in Action','Regina O. Obe and Leo S. Hsu',2015,'Manning','' );
Now that you have 10 books in the catalog before we start querying, let’s show you how to set up an index on the
Creating an index
After adding the books to the catalog, we should set up an index on the
hstore column to reduce the time it takes for PostgreSQL to look for particular keys and values.
hstore supports GIN and GiST indexing with
?| operators and hash or btree indexes with the
There isn’t a special way to create an index on an
hstore column; you just have to index the entire column, not individual keys or values. To create a GIN index on the
attributes column, we’d run the following SQL command:
CREATE INDEX idx_book_attrs ON books USING GIN(attributes);
And that’s all to it. Having an index set up, let’s look at how we can work with our
Adding, deleting, and editing hstore data
attributes column is where the
hstore data lives. As you can see, the key-value pairs are stored in a string like this:
'category => "history", pages => "504", includes => "maps and plates"'
If we look at the last book in the catalog “PostGIS in Action,” we’ll notice that it doesn’t have attributes set up—it’s just an empty column. To add some attributes to the book, we use the hstore function
hstore(...)to update the
UPDATE books SET attributes = hstore(ARRAY[['pages', '570'],['category','gis'],['includes','code samples'],['media', 'ebook']]) WHERE id = 10;
This example takes a two-dimensional array that includes the keys and values we want to add to the book. The
hstore function will then add those values to the column. Updating the column for “PostGIS in Action” using that function will give us the following:
To add to or update the attributes of books that already have values in the
hstore column, we use the same
UPDATE SQL command. For example, if we wanted to add on a “binding” attribute to the book “A History of the Vikings,” we could do that using the
hstore concatenate operator
|| like this:
UPDATE books SET attributes = attributes || 'binding=>"paper"' WHERE id = 9;
attributes ------------------------------------------------------------------------------------------ "pages"=>"504", "binding"=>"paper", "category"=>"history", "includes"=>"maps and plates"
Updating the “includes” attribute, we use the same SQL command:
UPDATE books SET attributes = attributes || 'includes=>"maps, plates, and pictures"' WHERE id = 9;
Now giving us:
attributes ----------------------------------------------------------------------------------------------------- "pages"=>"504", "binding"=>"paper", "category"=>"history", "includes"=>"maps, plates, and pictures"
It’s also easy to delete an attribute. For example, let’s delete “category_2” from “The Making of Textual Culture.” In the SQL query, we only need to define the
attributes column and key.
UPDATE books SET attributes = delete(attributes,'category_2') WHERE id = 7;
The book’s attributes are now as follows:
attributes ----------------------------------------------------------------------------------------------------------------- "pages"=>"604", "series"=>"Cambridge Studies in Medieval Literature", "category"=>"history", "series_num"=>"19"
Now that adding, editing, and deleting attributes is taken care of, let’s focus on querying data in the
hstorecolumn. As previously mentioned,
hstore has a variety of operators we can use to get keys and values.
The most common operator for searching through
->, which we can use to look up a specific key. For example, if we want to look for all the “category” keys with the value “history”, we’d write something like this:
SELECT title, author_editor, year FROM books WHERE attributes -> 'category' = 'history';
This will give us:
Using other operators, we could also create a query that looks for a particular key in each row instead of values. For example, searching for books that contain the key “language,” we’d use the
? operator that returns either true or false. Therefore, creating an SQL command for that would look like this:
SELECT title, author_editor, attributes FROM books WHERE attributes ? 'language';
This will give us two books:
If we are ambitious and want to check for more than one attribute, we might change the previous query and add the
?| operator. This operator checks whether any of the attributes occur in the books. Using the
ARRAY data type, we’ll include the keys we want to search for.
SELECT title, author_editor, attributes FROM books WHERE attributes ?| ARRAY['translation', 'binding'];
This gives us the following books:
We can also get a little creative with the queries and return all the books over 300 pages that have “category” as “history.” Since the “pages” key is text, we’ll have to cast it as an integer first.
SELECT title, year, attributes -> 'pages' AS pages FROM books WHERE (attributes -> 'pages')::integer > 300 AND attributes -> 'category' = 'history';
This query will give us two books:
Casting the string value to an integer is not the optimal way you should approach this problem, so you might consider placing pages either in its own column, or just going forth and converting everything to JSONB.
Converting to JSONB
If you still would like to convert your data to JSON,
hstore has functions to do that. The functions can convert the attributes into a matrix, array, JSON, or JSONB. For instance, if we want the output of the previous query as JSONB, we’d change it to the following:
SELECT title, author_editor, hstore_to_jsonb(attributes) FROM books WHERE attributes ?| ARRAY['translation', 'binding'];
Note that we use the
hstore_to_json() function and place the
attributes column inside. The results of the query look like this:
As you can see,
hstore successfully converted the results to JSONB, which you could put in another table to query using JSONB operators and functions. With
hstore, flexibility is king.
This is a simple introduction to
hstore in PostgreSQL. While it’s not as versatile as JSON or JSONB since they are fully-fledged document store types, it can be easily incorporated into new or old tables when you don’t want to create an entirely new set of columns for data that is not looked up frequently. While we suggest using JSONB for its versatility and power, we also want share with you alternatives to explore, especially some of the lesser used features of PostgreSQL.
Enjoyed this article? Get started with Databases for PostgreSQL now.
Databases for PostgreSQL is a fully managed, enterprise-ready PostgreSQL service with built-in security, high availability, and backup orchestration. Learn more here.