Re: CLUSTER and indisclustered

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: mark Kirkwood <markir(at)slithery(dot)org>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CLUSTER and indisclustered
Date: 2002-08-07 04:55:41
Message-ID: Pine.NEB.4.44.0208071351440.1214-100000@angelic.cynic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 7 Aug 2002, Tom Lane wrote:

> I thought to myself "didn't I just post something about that?"
> and then realized it was on a different mailing list. Here ya go
> (and no, this is not the first time around on this list either...)

Wow. I'm glad to see you looking at this, because this feature would so
*so* much for the performance of some of my queries, and really, really
impress my "billion-row-database" client.

> The idea is that you don't scan the index and base table concurrently
> as we presently do it. Instead, you scan the index and make a list
> of the TIDs of the table tuples you need to visit.

Right.

> Also, the main downside of this approach is that the bitmap could
> get large --- but you could have some logic that causes you to fall
> back to plain sequential scan if you get too many index hits.

Well, what I was thinking of, should the list of TIDs to fetch get too
long, was just to break it down in to chunks. If you want to limit to,
say, 1000 TIDs, and your index has 3000, just do the first 1000, then
the next 1000, then the last 1000. This would still result in much less
disk head movement and speed the query immensely.

(BTW, I have verified this emperically during testing of random read vs.
random write on a RAID controller. The writes were 5-10 times faster
than the reads because the controller was caching a number of writes and
then doing them in the best possible order, whereas the reads had to be
satisfied in the order they were submitted to the controller.)

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2002-08-07 05:12:28 Re: CLUSTER and indisclustered
Previous Message Joe Conway 2002-08-07 04:49:38 Re: Open 7.3 items