Unique index on hash of jsonb value - correct solution?

From: Albrecht Dreß <albrecht(dot)dress(at)arcor(dot)de>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Unique index on hash of jsonb value - correct solution?
Date: 2020-05-18 16:43:14
Message-ID: RHKB2EIQ.B7PUNVXM.MTKKMOHG@B6UPJ7T6.RKJUGJDU.XX7XFT3J
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all,

in a database I have a table with a text, a jsonb and a bytea column, which together shall be unique, like:

<snip>
Column | Type | Collation | Nullable | Default
---------+--------+-----------+----------+-------------------------------------
id | bigint | | not null | nextval('demotab_id_seq'::regclass)
textval | text | | not null |
jsonval | jsonb | | not null |
blobval | bytea | | not null |
Indexes:
"demo_idx" UNIQUE, btree (textval, jsonval, blobval)
</snip>

This seems to work just fine in most cases, but adding a bigger json value (the text and the bytea columns are always short) results in a “index row size 2840 exceeds maximum 2712 for index "…"” error. Following the hint in the error message, I replaced the index by

<snip>
Indexes:
"demo_idx" UNIQUE, btree (textval, md5(jsonval::text), blobval)
</snip>

which seems to solve the issue.

My question: is this approach (a) correct and (b) still safe if the items in the jsonb (always a dict in my case) are re-ordered? I tested a few cases, and trying to insert something like e.g. '{"a":1,"b":2}' and '{"b":2,"a":1}' actually does produce the same hash (i.e. the 2nd insert attempt is rejected due to the index), but is this guaranteed by design for every case? Or is there a better solution for this use case?

Thanks in advance,
Albrecht.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Don Seiler 2020-05-19 01:45:07 Re: template0 needing vacuum freeze?
Previous Message Andrus 2020-05-18 08:11:44 Re: Hot and PITR backups in same server