Indexes for hashes

From:	Ivan Voras <ivoras(at)gmail(dot)com>
To:	postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject:	Indexes for hashes
Date:	2016-06-15 09:34:18
Message-ID:	CAF-QHFULmVOdrqwtR7AKRnnx6=GbAW7S6v6f4jACEOVENef7NA@mail.gmail.com
Lists:	pgsql-performance

Hi,

I have an application which stores a large amount of hex-encoded hash
strings (nearly 100 GB of them), which means:

- The number of distinct characters (alphabet) is limited to 16
- Each string is of the same length, 64 characters
- The strings are essentially random
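
To make the arithmetic behind these constraints concrete: a 16-symbol alphabet carries 4 bits per character, so a 64-character string holds 256 bits, which is 32 bytes in binary form. The sketch below only illustrates that ratio. The helper name pack_hex and the SHA-256 sample value are assumptions for the example; the post does not say which hash function produced the strings.

```python
import hashlib


def pack_hex(hex_digest: str) -> bytes:
    """Convert a 64-character hex string into its 32-byte binary form.

    pack_hex is a hypothetical helper, not something from the original post.
    """
    if len(hex_digest) != 64:
        raise ValueError("expected exactly 64 hex characters")
    return bytes.fromhex(hex_digest)


# Assumed sample value: a SHA-256 digest, which happens to be 64 hex characters.
sample = hashlib.sha256(b"example").hexdigest()
packed = pack_hex(sample)

bits_per_char = 4  # log2(16): each hex character encodes 4 bits
total_bits = len(sample) * bits_per_char

# Text length in characters, binary length in bytes, information content in bits.
print(len(sample), len(packed), total_bits)
```

Whether storing or indexing the binary form is practical for this application is a separate question; the sketch only shows that the hex text is twice as long as the data it encodes.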

Creating a B-tree index on this column results in an index larger than
the table itself, and disk space is constrained.

I've found the SP-GiST radix tree index and thought it could be a good
match for the data because of the above constraints. An attempt to create
it (as in CREATE INDEX ON t USING spgist(field_name)) ran for more than
12 hours (a similar B-tree index takes a few hours at most), so I
interrupted it, assuming it was not going to finish in a reasonable time.
Some slides I found on the SP-GiST index suggest that neither its build
time nor its size is really suitable for this purpose.

My question is: what would be the most size-efficient index for this
situation?

Responses

Re: Indexes for hashes at 2016-06-15 12:45:49 from Torsten Zuehlsdorff
Re: Indexes for hashes at 2016-06-15 13:03:07 from ktm@rice.edu
Re: Indexes for hashes at 2016-06-15 13:38:33 from hubert depesz lubaczewski
Re: Indexes for hashes at 2016-06-17 03:51:03 from Claudio Freire
