From: | Morris de Oryx <morrisdeoryx(at)gmail(dot)com> |
---|---|
To: | Steven Winfield <Steven(dot)Winfield(at)cantabcapital(dot)com> |
Cc: | Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Questions about btree_gin vs btree_gist for low cardinality columns |
Date: | 2019-06-03 10:11:48 |
Message-ID: | CAKqncci+mtt-_5fdcOiNaxvtJF1ij5_dOTfda1t41mN0yVA=fw@mail.gmail.com |
Lists: | pgsql-general |
I didn't notice Bloom filters in the conversation so far, and have been
waiting for *years* for a good excuse to use a Bloom filter. I ran into
them years back in Splunk, which is a distributed log store. There's an
obvious benefit to a probabilistic tool like a Bloom filter there since
remote lookup (and/or retrieval from cold storage) is quite expensive,
relative to a local, hashed lookup. I haven't tried them in Postgres.
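
For anyone who hasn't met one, here is a rough Python sketch of the idea, not taken from Splunk or Postgres. The class name, the sizes, and the choice of SHA-256 are all assumptions made for illustration. The point is that a cheap local check can say "definitely not there" and save the expensive remote lookup, at the cost of occasional false positives.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: k hash probes into an m-bit array (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit array

    def _positions(self, item):
        # Derive k bit positions by salting a single hash with the probe index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent; True only means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))
```

A caller would only pay for the remote lookup when `might_contain()` returns True, and would still have to verify the result, since a True answer can be a false positive.
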
In the case of a single column with a small set of distinct values over a
large set of rows, how would a Bloom filter be preferable to, say, a GIN
index on an integer value?
I have to say, this is actually a good reminder in my case. We've got a lot
of small-distinct-values-big-rows columns. For example, "server_id",
"company_id", "facility_id", and so on. Only a handful of parent keys with
many millions of related rows. Perhaps it would be feasible to use a
Bloom index to do quick lookups on combinations of such values within the
same table. I haven't tried Bloom indexes in Postgres, so this might be worth
some experimenting.
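
To make the "combinations of values" idea concrete, here is a hedged Python sketch of a signature-style lookup, loosely modelled on how I understand the contrib bloom extension to work. It is not the extension's actual code. `SIGNATURE_BITS`, `BITS_PER_COLUMN`, and the function names are assumptions. Each row gets one small signature with a few bits set per column value; a query on any subset of columns builds its own signature, keeps rows whose signature contains it, and then rechecks the real values to throw out false positives.

```python
import hashlib

SIGNATURE_BITS = 80   # assumed signature length for this sketch
BITS_PER_COLUMN = 2   # assumed number of bits set per column value


def column_bits(column, value):
    """Return a bitmask with a few hash-selected bits for one column value."""
    mask = 0
    for i in range(BITS_PER_COLUMN):
        digest = hashlib.sha256(f"{column}:{i}:{value}".encode("utf-8")).digest()
        mask |= 1 << (int.from_bytes(digest[:8], "big") % SIGNATURE_BITS)
    return mask


def signature(row):
    """OR together the bits of every column in the row (or in the query)."""
    sig = 0
    for column, value in row.items():
        sig |= column_bits(column, value)
    return sig


def lookup(rows, signatures, **wanted):
    """Find rows matching all wanted column values, using signatures as a prefilter."""
    query_sig = signature(wanted)
    hits = []
    for row, sig in zip(rows, signatures):
        # Cheap test first: every query bit must be set in the row signature.
        if sig & query_sig != query_sig:
            continue
        # Recheck the actual values, since the signature test can give false positives.
        if all(row[col] == val for col, val in wanted.items()):
            hits.append(row)
    return hits
```

With `signatures = [signature(r) for r in rows]` built once, the same structure serves `lookup(rows, signatures, server_id=1)` and `lookup(rows, signatures, company_id=10, facility_id=7)` alike, which is the appeal over one index per column combination.
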
Is there any thought in the Postgres world of adding something like
Oracle's bitmap indexes?