Quick Links

Re: CLUSTER and indisclustered

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: CLUSTER and indisclustered
Date:	2002-08-04 02:45:51
Message-ID:	6392.1028429151@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> writes:
> It occured to me on the plane home that now that CLUSTER is fixed we may
> be able to put pg_index.indisclustered to use. If CLUSTER was to set
> indisclustered to true when it clusters a heap according to the given
> index, we could speed up sequantial scans.

AFAICT you're assuming that the table is *exactly* ordered by the
clustered attribute. While this is true at the instant CLUSTER
completes, the exact ordering will be destroyed by the first insert or
update :-(. I can't see much value in creating a whole new scan type
that's only usable on a perfectly-clustered table.

The existing approach to making the planner smart about clustered tables
is to compute a physical-vs-logical-order-correlation statistic and use
that to adjust the estimated cost of indexscans. I believe this is a
more robust approach than considering a table to be "clustered" or "not
clustered", since it can deal with the gradual degradation of clustered
order over time. However, I will not make any great claims for the
specific equations currently used for this purpose --- they're surely in
need of improvement. Feel free to take a look and see if you have any
ideas. The collection of the statistic is in commands/analyze.c and the
use of it is in optimizer/path/costsize.c.

regards, tom lane

In response to

CLUSTER and indisclustered at 2002-08-04 01:43:31 from Gavin Sherry

Responses

Re: CLUSTER and indisclustered at 2002-08-04 02:55:55 from Gavin Sherry
Re: CLUSTER and indisclustered at 2002-08-04 02:55:59 from Bruce Momjian

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2002-08-04 02:52:35	Re: PITR, checkpoint, and local relations
Previous Message	Tom Lane	2002-08-04 02:19:36	Re: getpid() function