Quick Links

Re: bitmap scans, btree scans, and tid order

From:	"Jeffrey W(dot) Baker" <jwbaker(at)acm(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: bitmap scans, btree scans, and tid order
Date:	2005-05-16 17:47:53
Message-ID:	1116265673.9920.18.camel@toonses.gghcwest.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, 2005-05-16 at 09:53 -0400, Tom Lane wrote:
> Jeffrey Baker <jwbaker(at)acm(dot)org> writes:
> > Change the planner/executor to use the bitmap scan in all cases where
> > index order is unimportant. From my reading of the current code, the
> > bitmap scan is only used in case of a join.
>
> This is a fallacy, and I think your concern is largely mistaken. Have
> you experimented with the cases you are worried about?

Perhaps I have not stated the problem clearly. Believe me, I have
experimented extensively with this problem. So here's the statement of
the problem: during an index scan, the heap is visited in index order,
which can be and frequently is random order on the disk. An index scan
that currently takes 15 minutes on my database takes only 5 when the
tuples are fetched in strict disk order, and takes 8 minutes if the
tuples are fetched in disk order in groups of 1000. The problem exists
when the scan is expected to return a lot of tuples, but the planner
chooses an index scan anyway. In many cases, sequential scan would be
faster than index scan + random heap visits.

> It's entirely possible that the current cost model needs adjustment to
> make the planner pick the bitmap scan in more cases. However, it is
> easy to demonstrate (either by thought-experiment or actual trial) that
> a bitmap scan isn't necessarily a win.

I agree. There's certainly a threshold below which the bitmap would not
make sense. It's also possible that changing the btree scan to work in
groups of tuples instead of single tuples would make more sense, which
is why I ventured two different solution to the problem.

> The overhead of maintaining the
> bitmap is far from zero, and you don't get anything unless the resulting
> pattern of heap page fetches is significantly cleaner than before.

Bitmaps scans naturally come back in TID order. I realize this isn't
1:1 correspondence with disk order, but it's a good start. I also like
the way the index scan and heap scan are decoupled in the bitmap code.
It leaves room for more serious optimization of disk access, which is
relatively difficult in the synchronous, one-at-a-time btree code.

> Therefore, a patch that eliminates cost-estimation in favor of trying to
> force one choice or the other is definitely not likely to be received
> favorably ...

That's not the idea.

-jwb

In response to

Re: bitmap scans, btree scans, and tid order at 2005-05-16 13:53:30 from Tom Lane

Responses

Re: bitmap scans, btree scans, and tid order at 2005-05-16 18:35:19 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Josh Berkus	2005-05-16 17:56:38	Re: pgFoundry
Previous Message	Andrew Dunstan	2005-05-16 17:23:46	Re: pgFoundry