Quick Links

JSONB Array of Strings (with GIN index) versus Split Rows (B-Tree Index)

From:	Syed Jafri <syed(dot)jafri2(at)ucalgary(dot)ca>
To:	"pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject:	JSONB Array of Strings (with GIN index) versus Split Rows (B-Tree Index)
Date:	2019-02-04 05:34:57
Message-ID:	0F6A6504-F1DE-408D-9A8F-FA7F3B8C38B1@ucalgary.ca
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

I have a database which stores receiver to indicate which account the data relates to. This has led to tons of duplication of data, as one set of data may create 3 separate rows, where the only difference is the receiver column.

|---------------------|------------------|---------------------|------------------|
| Receiver | Event | Date | Location |
|---------------------|------------------|---------------------|------------------|
| Alpha | 3 | 12 | USA |
|---------------------|------------------|---------------------|------------------|
| Bravo | 3 | 12 | USA |
|---------------------|------------------|---------------------|------------------|
| Charlie | 3 | 12 | USA |
|---------------------|------------------|---------------------|------------------|

While redesigning the database, I have considered using an array with a GIN index instead of the current B-Tree index on receiver. My proposed new table would look like this:
|-------------------------------|--------------|------------|-------------------|
| Receivers | Event | Date | Location |
|-------------------------------|--------------|------------|-------------------|
| ["Alpha", "Bravo", "Charlie"] | 3 | 12 | USA |
|-------------------------------|--------------|------------|-------------------|

More Information:
· Receiver names are of the type (a-z, 1-5, .)
· 95% of all queries currently look like this: SELECT * FROM table WHERE Receiver = Alpha, with the new format this would be SELECT * FROM table WHERE receivers @> '"Alpha"'::jsonb;
· The table currently contains over 4 billion rows (with duplication) and the new proposed schema would cut it down to under 2 billion rows.
·
Question:
1. Does it make more sense to use Postgres Native Text Array?
2. Would a jsonb_path_ops GIN index on receivers make sense here?
3. Which option is more efficient? Which is faster?

Responses

Re: JSONB Array of Strings (with GIN index) versus Split Rows (B-Tree Index) at 2019-02-04 21:44:25 from Ian Zimmerman
Re: JSONB Array of Strings (with GIN index) versus Split Rows (B-Tree Index) at 2019-02-04 22:02:44 from David G. Johnston

Browse pgsql-general by date

	From	Date	Subject
Next Message	rob stone	2019-02-04 05:57:10	Re: Server goes to Recovery Mode when run a SQL
Previous Message	Michael Paquier	2019-02-04 01:07:09	Re: Server goes to Recovery Mode when run a SQL