Avoiding superfluous buffer locking during nbtree backwards scans

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Avoiding superfluous buffer locking during nbtree backwards scans
Date: 2024-07-05 16:47:49
Message-ID: CAH2-Wz=xgs7PojG=EUvhgadwENzu_mY_riNh-w9wFPsaS717ew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached patch teaches nbtree backwards scans to avoid needlessly
relocking a previously read page/buffer at the point where we need to
consider reading the next page (the page to the left).

Currently, we do this regardless of whether or not we already decided
to end the scan, back when we read the page within _bt_readpage. We'll
relock the page we've just read (and just unlocked), just to follow
its left link, while making sure to deal with concurrent page splits
correctly. But why bother relocking the page, or with thinking about
concurrent splits, if we can already plainly see that we cannot
possibly need to find matching tuples on the left sibling page?

The patch just adds a simple precheck, which works in the obvious way.
Seems like this was a missed opportunity for commit 2ed5b87f96.

On HEAD, the following query requires 24 buffer hits (I'm getting a
standard/forward index scan for this):

select
abalance
from
pgbench_accounts
where
aid in (1, 500, 1000, 1500, 2000, 3000) order by aid asc;

However, we don't see that with the backwards scan variant:

select
abalance
from
pgbench_accounts
where
aid in (1, 500, 1000, 1500, 2000, 3000) order by aid desc;

We actually see 30 buffer hits for this on HEAD. Whereas with the
patch, both variants get only 24 buffer hits -- there's parity, at
least in cases like these two.

Note that we only "achieve parity" here because we happen to be using
a SAOP, requiring multiple primitive index scans, each of which ends
with its own superfluous lock acquisition. Backwards scans remain at a
disadvantage with regard to buffer locks acquired in other cases --
cases that happen to involve scanning several sibling leaf pages in
sequence (no change there).

It's probably possible to fix those more complicated cases too. But
that would require a significantly more complicated approach. I
haven't touched existing comments in _bt_readnextpage that contemplate
this possibility.

--
Peter Geoghegan

Attachment Content-Type Size
v1-0001-Avoid-unneeded-nbtree-backwards-scan-buffer-lock.patch application/octet-stream 1.7 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Sami Imseih 2024-07-05 16:49:45 Re: Restart pg_usleep when interrupted
Previous Message Joel Jacobson 2024-07-05 16:42:45 Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.