From: Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
To: Ivan Voras <ivoras(at)gmail(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Indexes for hashes
Date: 2016-06-15 12:45:49
Message-ID: 2d87ea8e-9105-682e-fec5-b355f09d8397@toco-domains.de
Lists: pgsql-performance
Hello Ivan,
> I have an application which stores a large amount of hex-encoded hash
> strings (nearly 100 GB of them), which means:
>
> * The number of distinct characters (alphabet) is limited to 16
> * Each string is of the same length, 64 characters
> * The strings are essentially random
>
> Creating a B-Tree index on this results in the index size being larger
> than the table itself, and there are disk space constraints.
>
> I've found the SP-GIST radix tree index, and thought it could be a good
> match for the data because of the above constraints. An attempt to
> create it (as in CREATE INDEX ON t USING spgist(field_name)) apparently
> takes more than 12 hours (while a similar B-tree index takes a few hours
> at most), so I've interrupted it because "it probably is not going to
> finish in a reasonable time". Some slides I found on the spgist index
> allude that both build time and size are not really suitable for this
> purpose.
>
> My question is: what would be the most size-efficient index for this
> situation?
It depends on what you want to query. What about a BRIN index?

https://www.postgresql.org/docs/9.5/static/brin-intro.html

It results in a very small index, but whether it fits your needs depends
on the queries you run.
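
As a minimal sketch, reusing the table and column names from your example
(t and field_name) and an illustrative index name:

    -- pages_per_range is optional; the default is 128 heap pages per summary
    CREATE INDEX t_field_name_brin ON t USING brin (field_name)
        WITH (pages_per_range = 64);

    -- compare the on-disk size with your B-tree afterwards
    SELECT pg_size_pretty(pg_relation_size('t_field_name_brin'));

Whether the per-block-range min/max summaries narrow scans at all on
essentially random values is something only a test on a sample of the
data will show.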
Greetings,
Torsten