From: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
---|---|
To: | Claudio Freire <klaussfreire(at)gmail(dot)com> |
Cc: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Minmax indexes |
Date: | 2014-06-21 18:08:57 |
Message-ID: | 20140621180857.GA24593@svana.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I'm sorry if I missed something, but ISTM this is beginning to look a
lot like GiST. This was pointed out by Robert Haas last year.
On Wed, Jun 18, 2014 at 12:09:42PM -0300, Claudio Freire wrote:
> So, you have:
>
> An aggregate to generate a "compressed set" from several values
Which GiST does by calling 'compress' on each value, and the 'unions' the
results together.
> A function which adds a new value to the "compressed set" and returns
> the new "compressed set"
Again, 'compress' + 'union'
> A function which tests if a value is in a "compressed set"
Which GiST does using 'compress' +'consistant'
> A function which tests if a "compressed set" overlaps another
> "compressed set" of equal type
Which GiST calls 'consistant'
So I'm wondering why you can't just reuse the btree_gist functions we
already have in contrib. It seems to me that these MinMax indexes are
in fact a variation on GiST that indexes the pages of a table based
upon the 'union' of all the elements in a page. By reusing the GiST
operator class you get support for many datatypes for free.
> If you can define different compressed sets, you can use this to
> generate both min/max indexes as well as bloom filter indexes. Whether
> we'd want to have both is perhaps questionable, but having the ability
> to is probably desirable.
You could implement bloom filter in GiST too. It's been discussed
before but I can't find any implementation. Probably because the filter
needs to be parameterised and if you store the bloom filter for each
element it gets expensive very quickly. However, hooked into a minmax
structure which only indexes whole pages it could be quite efficient.
> One problem with such a generalized implementation would be, that I'm
> not sure in-place modification of the "compressed set" on-disk can be
> assumed to be safe on all cases. Surely, for strictly-enlarging sets
> it would, but while min/max and bloom filters both fit the bill, it's
> not clear that one can assume this for all structures.
I think GiST has already solved this problem.
Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2014-06-21 18:23:44 | Re: idle_in_transaction_timeout |
Previous Message | Kevin Grittner | 2014-06-21 18:06:26 | Re: delta relations in AFTER triggers |