Quick Links

Re: Clock sweep not caching enough B-Tree leaf pages?

From:	Greg Stark <stark(at)mit(dot)edu>
To:	Peter Geoghegan <pg(at)heroku(dot)com>
Cc:	Ants Aasma <ants(at)cybertec(dot)at>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clock sweep not caching enough B-Tree leaf pages?
Date:	2014-04-17 15:10:38
Message-ID:	CAM-w4HMO_Y2NFBMP+PK_Ttm9RwOAD1Brrq9vfxPzb3qkyqLTRQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Apr 15, 2014 at 7:30 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Frankly, there doesn't need to be any research on this, because it's
> just common sense that probabilistically, leaf pages are much more
> useful than heap pages in servicing index scan queries if we assume a
> uniform distribution. If we don't assume that, then they're still more
> useful on average.

I don't think "common sense" is compelling. I think you need to pin
down exactly what it is about btree intermediate pages that the LRU
isn't capturing and not just argue they're more useful. The LRU is
already capturing which pages are more heavily used than others so you
need to identify what it is that makes index pages *even more* useful
than their frequency and recency of access indicates. Not just that
they're more useful than an average page.

So what I think is missing is that indexes are always accessed from
the root down to the leaf. So the most recent page accessed will
always be the leaf. And in whatever chain of pages was used to reach
the last leaf page the least recently accessed will always be the
root. But we'll need the root page again on the subsequent descent
even if it's to reach the same leaf page we kept in ram in preference
to it.

Now it doesn't *always* make sense to keep an intermediate page over
leaf pages. Imagine an index that we always do full traversals of.
We'll always descend from the root down the left-most pages and then
follow the right pointers across. All the other intermediate pages
will be cold. If we do an occasional descent probing for other keys
those leaf pages shouldn't be cached since they won't be needed again
for the common full index traversals and the next occasional probe
will probably be looking for different keys.

But if we're often probing for the same keys the last thing we want to
do is throw away one of the intermediate pages for those keys when we
could throw away a leaf page. But that's what would happen in a strict
LRU. It's almost like what we would really want to do is mark the
pages as least recently used in the opposite order from the order
they're actually accessed when descending. Or perhaps bump the usage
count to max+1 when it's an intermediate page so that it takes one
extra cycle of decrementing before it's considered old compared to a
leaf page.

--
greg

In response to

Re: Clock sweep not caching enough B-Tree leaf pages? at 2014-04-15 23:30:39 from Peter Geoghegan

Responses

Re: Clock sweep not caching enough B-Tree leaf pages? at 2014-04-17 20:33:27 from Peter Geoghegan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2014-04-17 15:31:54	Re: Verbose output of pg_dump not show schema name
Previous Message	Fabrízio de Royes Mello	2014-04-17 15:07:39	Re: Verbose output of pg_dump not show schema name