Re: ReadRecentBuffer() doesn't scale well

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: ReadRecentBuffer() doesn't scale well
Date: 2023-06-27 04:53:12
Message-ID: CAH2-WznwevAK-mf1BTO9QBPMee_ghSzxheBYLW6Wc5sseAF30A@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 26, 2023 at 9:40 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> If the goal is to get rid of both pins and content locks, LSN isn't
> enough. A page might be evicted and replaced by another page that has
> the same LSN because they were modified by the same record. Maybe
> that's vanishingly rare, but the correct thing would be a counter that
> goes up on modification AND eviction.

It should be safe to allow searchers to see a version of the root page
that is out of date. The Lehman & Yao design is very permissive about
these things. There aren't any special cases where the general rules
are weakened in some way that might complicate this approach.
Searchers need to check the high key to determine if they need to move
right -- same as always.

More concretely: A root page can be concurrently split when there is
an in-flight index scan that is about to land on it (the page becomes
the left half of the split). It doesn't matter if it's a searcher that is
"between" the meta page and the root page. It doesn't matter if a
level was added. This is true even though nothing that you'd usually
think of as an interlock is held "between levels". The root page isn't
really special, except in the obvious way. We can even have two roots
at the same time (the true root, and the fast root).
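
To illustrate the point about a searcher landing on a root page that
was split underneath it, here is a minimal sketch in C. It is not
nbtree code: the Page struct, NO_HIGH_KEY, and the functions
move_right(), page_contains(), split_page() and search_from() are all
hypothetical names invented for this example, keys are plain ints, and
the tree is reduced to a single level with no locking at all. It only
shows why the high key check plus the right link is enough for a
searcher holding a stale pointer.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_KEYS    4
#define NO_HIGH_KEY INT_MAX     /* rightmost page on its level: no upper bound */

/* Hypothetical, heavily simplified page; not the nbtree page layout. */
typedef struct Page
{
    int          high_key;      /* upper bound on the keys stored on this page */
    struct Page *right;         /* right sibling link, NULL if rightmost */
    int          nkeys;
    int          keys[MAX_KEYS];    /* sorted ascending */
} Page;

/* Follow right links until the search key is covered by the page's high key. */
Page *
move_right(Page *page, int key)
{
    while (page->high_key != NO_HIGH_KEY && key > page->high_key)
        page = page->right;
    return page;
}

/* Linear probe of a single page. */
bool
page_contains(const Page *page, int key)
{
    for (int i = 0; i < page->nkeys; i++)
    {
        if (page->keys[i] == key)
            return true;
    }
    return false;
}

/*
 * Split "left" in place: the upper half of its keys moves to "new_right",
 * and "left" keeps its identity as the left half of the split.
 * Assumes left->nkeys >= 2.
 */
void
split_page(Page *left, Page *new_right)
{
    int         keep = left->nkeys / 2;

    new_right->nkeys = left->nkeys - keep;
    for (int i = 0; i < new_right->nkeys; i++)
        new_right->keys[i] = left->keys[keep + i];
    new_right->high_key = left->high_key;
    new_right->right = left->right;

    left->nkeys = keep;
    left->high_key = left->keys[keep - 1];
    left->right = new_right;
}

/* Search starting from whatever page pointer the caller has, stale or not. */
bool
search_from(Page *start, int key)
{
    return page_contains(move_right(start, key), key);
}
```

In this sketch, a caller that remembered a pointer to a single-page
root holding 10, 20, 30, 40 can still find 40 through that same
pointer after split_page() has run: the old page now has a high key of
20, so search_from() moves right to the new sibling before probing.
The caller never needs to learn that a split (or a new level above it)
happened.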

--
Peter Geoghegan
