Re: heap_inplace_lock vs. autovacuum w/ LOCKTAG_TUPLE

From: Noah Misch <noah(at)leadboat(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap_inplace_lock vs. autovacuum w/ LOCKTAG_TUPLE
Date: 2024-10-27 21:40:35
Message-ID: 20241027214035.8a.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Oct 27, 2024 at 08:00:00AM +0300, Alexander Lakhin wrote:
> 27.10.2024 07:09, Noah Misch wrote:
> > On Sat, Oct 26, 2024 at 11:49:36AM -0700, Noah Misch wrote:
> > > intra-grant-inplace-db.spec got a novel failure today:
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sarus&dt=2024-10-26%2014%3A08%3A58
> > >
> > > The isolationtester_waiting query is supposed to detect that step vac2 is
> > > blocked. vac2 didn't finish within the timeout, but isolationtester_waiting
> > > never considered it blocked.
> > > ... work on reproducing this.

The attached demo reproduces $SUBJECT for me. The clue was 'CONTEXT: while
scanning block 0 of relation "pg_catalog.pg_database"'. That
vacuum_error_callback() message indicated VACUUM_ERRCB_PHASE_SCAN_HEAP as the
current phase. That implies vac2 had not reached any inplace update, any
LOCKTAG_TUPLE, or any part of vac_update_datfrozenxid(). I bet vac2 was stuck
in LockBufferForCleanup(). Concurrently, a sequence of autovacuum workers
held a buffer pin blocking that cleanup. The demo makes two changes:

a. Start intra-grant-inplace-db.spec step vac2 just after some autovacuum
worker is blocked on the xid of the uncommitted GRANT. While blocked, the
worker holds a pin on the one pg_database page. vac2 needs
LockBufferForCleanup($THAT_PAGE).
b. Make LockBufferForCleanup() sleep 10ms after WAIT_EVENT_BUFFER_PIN, to give
the next autovacuum worker time to win the pinning race.

LockBufferForCleanup() waits for a pin count to fall to 1. While waiting, its
techniques are less sophisticated than what we have for lock.c heavyweight
locks. UnpinBufferNoOwner() notifies the cleanup waiter, but nothing stops
another backend from pinning before the cleanup waiter reacts. We have code
to cancel an autovacuum for the purpose of liberating a heavyweight lock, but
we don't have code like that to free a pin.
pg_isolation_test_session_is_blocked() doesn't detect buffer pin blockages.

Two fix candidates look most promising:

1. Unpin before heap_inplace_lock() sleeps.
2. Change intra-grant-inplace-db.spec to test freezing only MXIDs, not XIDs.

Let's do (1), as attached. Apart from some arguably-convoluted call stacks,
this has no disadvantages. An earlier version did unpin.
postgr.es/m/20240822073200.4f.nmisch@google.com stopped that, arguing,
"heap_update() doesn't optimize that way". In light of $SUBJECT, I no longer
see heap_update() as the applicable standard. Since autovacuum can run
anytime, it has an extra duty to be non-disruptive. heap_update() during
auto-analyze won't wait much, because analyze takes a self-exclusive table
lock and is the sole writer of stats tables. Inplace updates during
autovacuum are different. They do contend with other transactions. Since we
lack code to cancel an autovacuum to free a pin, a long-lived pin in
autovacuum is more disruptive than a long-lived lock in autovacuum.

While I've not tried it, I expect (2) would work as follows. pg_database
won't contain MXIDs, so lazy_scan_noprune() would approve continuing without
the cleanup lock. Unlike (1), this wouldn't help concurrency outside the
test.

> FWIW, there was a similar failure in August: [1], and I also could not
> reproduce that locally, yet wrote a preliminary analysis at [2] in the
> Unsorted section, in the hope to see it again and continue investigation.
>
> [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=iguana&dt=2024-08-29%2013%3A57%3A57
> [2] https://wiki.postgresql.org/wiki/Known_Buildfarm_Test_Failures

Thanks. I had not known about iguana's failure. What are the chances that
the buffer pin could explain that one?

Attachment Content-Type Size
demo-inplace210-pin-starvation-v1.patch text/plain 3.8 KB
inplace220-unpin-v1.patch text/plain 4.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2024-10-27 21:45:11 Re: race condition in pg_class
Previous Message Yasir 2024-10-27 20:15:19 Re: Alias of VALUES RTE in explain plan