Re: new index type with clustering in mind.

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Jack Douglas <jack(at)douglastechnology(dot)co(dot)uk>
Cc: 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'Martijn van Oosterhout' <kleptog(at)svana(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: new index type with clustering in mind.
Date: 2014-12-10 21:03:02
Message-ID: 20141210210301.GB1768@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Jack Douglas wrote:
> > If the values are perfectly clustered, the index is optimal because you
> scan the minimal set of pages.
>
> That's the bit I'm particularly interested in, as my plan would be to keep
> the pages well clustered: http://dba.stackexchange.com/a/66293/1396
>
> Do you see any blocker preventing BRIN being used for a continuous
> background re-clustering job (in parallel with or as part of vacuum),
> similar to the mechanism I experimented with before? If not is this
> something there might be support for adding to the TODO list?

In principle, CLUSTER sucks, and having data clustered is clearly good,
so improvements in this area are certainly welcome.

If you were to propose some general mechanism that works for any index,
I don't see that we would reject having it work specifically for BRIN;
having it for BRIN only would be strange. (I guess it's good enough if
it works for btree and BRIN. Not sure about GiST and SP-GiST; GIN
clearly is of no use here.)

Currently, one issue you're going to face is that brin doesn't rescan a
range to find the tighest possible summary tuple. Thinking in min/max
terms, once a tuple with a very high or very low is inserted, the range
doesn't get any smaller once that tuple is deleted from the range. You
would need to find a way to fix that. (The easiest way is to REINDEX
the whole thing, of course, but that processes the whole table and not
just some portion of it.)

Another issue is how to find the best possible ordering. For minmax
opclasses it's easy, but for other opclass designs it's less clear what
to do. Even for minmax you need to find some way to communicate to the
system what's the order to follow ...

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2014-12-10 23:06:28 Re: ALTER TYPE ADD SEND AND RECEIVE
Previous Message Manuel Kniep 2014-12-10 20:44:39 ALTER TYPE ADD SEND AND RECEIVE