Re: CLUSTER and indisclustered

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CLUSTER and indisclustered
Date: 2002-08-04 02:55:59
Message-ID: 200208040255.g742txn24484@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> writes:
> > It occured to me on the plane home that now that CLUSTER is fixed we may
> > be able to put pg_index.indisclustered to use. If CLUSTER was to set
> > indisclustered to true when it clusters a heap according to the given
> > index, we could speed up sequantial scans.
>
> AFAICT you're assuming that the table is *exactly* ordered by the
> clustered attribute. While this is true at the instant CLUSTER
> completes, the exact ordering will be destroyed by the first insert or
> update :-(. I can't see much value in creating a whole new scan type
> that's only usable on a perfectly-clustered table.
>
> The existing approach to making the planner smart about clustered tables
> is to compute a physical-vs-logical-order-correlation statistic and use
> that to adjust the estimated cost of indexscans. I believe this is a
> more robust approach than considering a table to be "clustered" or "not
> clustered", since it can deal with the gradual degradation of clustered
> order over time. However, I will not make any great claims for the
> specific equations currently used for this purpose --- they're surely in
> need of improvement. Feel free to take a look and see if you have any
> ideas. The collection of the statistic is in commands/analyze.c and the
> use of it is in optimizer/path/costsize.c.

Tom, should we be updating that flag after we CLUSTER instead of
requiring an ANALYZE after the CLUSTER?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2002-08-04 02:57:33 Re: CLUSTER and indisclustered
Previous Message Gavin Sherry 2002-08-04 02:55:55 Re: CLUSTER and indisclustered