Re: Minmax indexes

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Minmax indexes
Date:	2014-07-14 19:56:30
Message-ID:	CA+TgmoZUf2g1Fen3bq2uNLAJafCeU9K-vEnwmPk0Xe9FZuqKmA@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Jul 9, 2014 at 5:16 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> The way it works now, each opclass needs to have three support
> procedures; I've called them getOpers, maybeUpdateValues, and compare.
> (I realize these names are pretty bad, and will be changing them.)
> getOpers is used to obtain information about what is stored for that
> data type; it says how many datum values are stored for a column of that
> type (two for sortable: min and max), and how many operators it needs
> set up. Then, the generic code fills in a MinmaxDesc(riptor) and creates
> an initial DeformedMMTuple (which is a rather ugly name for a minmax
> tuple held in memory). The maybeUpdateValues amproc can then be called
> when there's a new heap tuple, which updates the DeformedMMTuple to
> account for the new tuple (in essence, it's a union of the original
> values and the new tuple). This can be done repeatedly (when a new
> index is being created) or only once (when a new heap tuple is inserted
> into an existing index). There is no need for an "aggregate".
>
> This DeformedMMTuple can easily be turned into the on-disk
> representation; there is no hardcoded assumption on the number of index
> values stored per heap column, so it is possible to build an opclass
> that stores a bounding box column for a geometry heap column, for
> instance.
>
> Then we have the "compare" amproc. This is used during index scans;
> after extracting an index tuple, it is turned into DeformedMMTuple, and
> the "compare" amproc for each column is called with the values of scan
> keys. (Now that I think about this, it seems pretty much what
> "consistent" is for GiST opclasses). A true return value indicates that
> the scan key matches the page range boundaries and thus all pages in the
> range are added to the output TID bitmap.
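
For readers following along, the flow quoted above can be sketched roughly as follows. This is an illustration in Python, not the patch's C code: the names MinMaxSummary, PAGES_PER_RANGE, build_index and scan are hypothetical, the "heap" is just a list of pages holding integers, and only the roles of maybeUpdateValues and compare are taken from the description.

```python
from dataclasses import dataclass
from typing import List, Optional

PAGES_PER_RANGE = 2  # hypothetical: heap pages summarized by one index tuple


@dataclass
class MinMaxSummary:
    """In-memory summary of one page range (the role DeformedMMTuple plays)."""
    lo: Optional[int] = None
    hi: Optional[int] = None


def maybe_update_values(summary: MinMaxSummary, value: int) -> bool:
    """Widen the summary so it covers value; return True if it changed."""
    changed = False
    if summary.lo is None or value < summary.lo:
        summary.lo = value
        changed = True
    if summary.hi is None or value > summary.hi:
        summary.hi = value
        changed = True
    return changed


def compare(summary: MinMaxSummary, op: str, key: int) -> bool:
    """Return True if some value in [lo, hi] could satisfy `value op key`."""
    if summary.lo is None:
        return False  # empty range: nothing can match
    if op == "=":
        return summary.lo <= key <= summary.hi
    if op == "<":
        return summary.lo < key
    if op == "<=":
        return summary.lo <= key
    if op == ">":
        return summary.hi > key
    if op == ">=":
        return summary.hi >= key
    raise ValueError(f"unsupported operator: {op}")


def build_index(pages: List[List[int]]) -> List[MinMaxSummary]:
    """Index build: call maybe_update_values repeatedly for each page range."""
    index = []
    for start in range(0, len(pages), PAGES_PER_RANGE):
        summary = MinMaxSummary()
        for page in pages[start:start + PAGES_PER_RANGE]:
            for value in page:
                maybe_update_values(summary, value)
        index.append(summary)
    return index


def scan(index: List[MinMaxSummary], n_pages: int, op: str, key: int) -> List[int]:
    """Index scan: emit every page of each range whose summary may match."""
    matching = []
    for i, summary in enumerate(index):
        if compare(summary, op, key):
            first = i * PAGES_PER_RANGE
            matching.extend(range(first, min(first + PAGES_PER_RANGE, n_pages)))
    return matching
```

Inserting a new heap tuple into an existing index corresponds to a single maybe_update_values call on the summary of the affected range. The scan result is lossy by design: a returned page may hold no matching tuple, so the heap still has to be rechecked.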

This sounds really great. I agree that it needs some renaming. I
think renaming what you are calling "compare" to "consistent" would be
an excellent idea, to match GiST. "maybeUpdateValues" sounds like it
does the equivalent of GiST's "compress" on the new value followed by
a "union" with the existing summary item. I don't think it's
necessary to separate those out, though. You could perhaps call it
something like "add_item".
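
To make the compress-then-union reading concrete, here is a rough sketch in Python (not PostgreSQL C) using the bounding-box case mentioned in the quoted text. Box, the point representation, and the function bodies are assumptions for illustration only; just the names compress, union, consistent and add_item come from this thread and from GiST.

```python
from typing import NamedTuple, Optional, Tuple


class Box(NamedTuple):
    """Per-range summary for a geometry column: four stored values, not two."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def compress(point: Tuple[float, float]) -> Box:
    """Turn a single heap value into a degenerate summary covering only it."""
    x, y = point
    return Box(x, y, x, y)


def union(a: Box, b: Box) -> Box:
    """Smallest box that covers both summaries."""
    return Box(min(a.xmin, b.xmin), min(a.ymin, b.ymin),
               max(a.xmax, b.xmax), max(a.ymax, b.ymax))


def add_item(summary: Optional[Box], point: Tuple[float, float]) -> Box:
    """Compress the new value, then union it with the existing summary."""
    item = compress(point)
    return item if summary is None else union(summary, item)


def consistent(summary: Optional[Box], query: Box) -> bool:
    """True if the page range might contain a point inside the query box."""
    if summary is None:
        return False  # nothing summarized yet, so nothing can match
    return (summary.xmin <= query.xmax and query.xmin <= summary.xmax and
            summary.ymin <= query.ymax and query.ymin <= summary.ymax)
```

Folding both steps into add_item means an opclass author writes one function per data type, and the degenerate summary never needs to be exposed outside it.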

Also, FWIW, I liked Peter's idea of calling these "summarizing
indexes"; perhaps "summary" would be a bit shorter and mean the same
thing. "minmax" wouldn't be the end of the world, but since you've
gone to the trouble of making this more generic, I think giving it a
more generic name would be a very good idea.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Minmax indexes at 2014-07-09 21:16:19 from Alvaro Herrera
