From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Checkpointer sync queue fills up / loops around pg_usleep() are bad |
Date: | 2022-02-26 21:39:42 |
Message-ID: | 20220226213942.nb7uvb2pamyu26dj@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
In two recent investigations of occasional test failures
(019_replslot_limit.pl failures, AIO rebase), the problems are somehow tied to
the checkpointer.
I don't yet know whether this is actually causally related to precisely those
failures, but when running e.g. 027_stream_regress.pl, I see phases in which
many backends are looping in RegisterSyncRequest() repeatedly, each time
sleeping with pg_usleep(10000L).
Without adding instrumentation this is completely invisible at any log
level. There are no log messages, no wait events, nothing.
ISTM, we should not have any loops around pg_usleep(). And shorter term, we
shouldn't have any loops around pg_usleep() that don't emit log messages / set
wait events. Therefore I propose that we "prohibit" such loops without at
least a DEBUG2 elog() or so. It's just too hard to debug.
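A minimal standalone sketch of what such an instrumented loop could look like, written outside PostgreSQL so it can be compiled on its own. try_enqueue() and enqueue_with_retry() are hypothetical stand-ins (not the real RegisterSyncRequest()), nanosleep() stands in for pg_usleep(10000L), and the fprintf() stands in for a DEBUG2 elog():

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical stand-in for a full request queue: the enqueue attempt fails
 * until *attempts_until_free has counted down to zero.
 */
static bool
try_enqueue(int *attempts_until_free)
{
    if (*attempts_until_free > 0)
    {
        (*attempts_until_free)--;
        return false;
    }
    return true;
}

/*
 * Retry until the request is accepted, sleeping 10ms between attempts.
 * Every sleep is reported, so the loop is no longer invisible.
 * Returns the number of times we had to sleep.
 */
static int
enqueue_with_retry(int *attempts_until_free)
{
    int             retries = 0;
    struct timespec delay = {0, 10 * 1000 * 1000};  /* 10ms */

    while (!try_enqueue(attempts_until_free))
    {
        retries++;
        /* stand-in for elog(DEBUG2, ...) */
        fprintf(stderr, "DEBUG2: request queue full, retry %d\n", retries);
        nanosleep(&delay, NULL);
    }
    return retries;
}
```

The point is only that each trip through the loop leaves a trace. In the server itself a wait event would be the cheaper way to get the same visibility.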
The reason for the sync queue filling up in 027_stream_regress.pl is actually
fairly simple:
1) The test runs with shared_buffers = 1MB, leading to a small sync queue of
128 entries.
2) CheckpointWriteDelay() does pg_usleep(100000L), i.e. it sleeps for 100ms.
ForwardSyncRequest() wakes up the checkpointer using SetLatch() if the sync
queue is more than half full.
But at least on Linux and FreeBSD that doesn't actually interrupt pg_usleep()
anymore (due to using signalfd / kqueue rather than a signal handler). And on
all platforms the signal might arrive just before the pg_usleep() rather than
during it, in which case the sleep is not interrupted either.
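The numbers above can be worked out with a little arithmetic. This is a sketch under two assumptions that are not spelled out in this mail: the queue has one slot per shared buffer, and the block size is the default 8kB. The function names are made up for illustration.

```c
#include <stddef.h>

/* Assumption: the sync queue has one slot per shared buffer. */
static size_t
sync_queue_entries(size_t shared_buffers_bytes, size_t block_size)
{
    return shared_buffers_bytes / block_size;
}

/*
 * Assumption: the checkpointer is nudged once the queue is about half full.
 * Whatever arrives after that point has to fit into the remaining half.
 */
static size_t
wakeup_threshold(size_t queue_entries)
{
    return queue_entries / 2;
}
```

With the test's settings, sync_queue_entries(1024 * 1024, 8192) is 128 and wakeup_threshold(128) is 64. So if the wakeup is missed, roughly 64 further requests fit before backends start looping, and the checkpointer may be asleep for up to 100ms.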
If I shorten the sleep in CheckpointWriteDelay() the problem goes away. This
actually reduces the time for a single run of 027_stream_regress.pl on my
workstation noticeably. With the default sleep time it's ~32s, with the shortened time
it's ~27s.
I suspect we need to do something about this concrete problem for 14 and
master, because it's certainly worse than before on Linux / FreeBSD.
I suspect the easiest fix is to just convert that usleep to a WaitLatch(). That'd
require adding a new enum value to WaitEventTimeout in 14. Which probably is
fine?
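To illustrate why a latch-style wait does not lose a wakeup that arrives just before the wait starts, here is a standalone sketch using a pipe and poll(). It is not PostgreSQL's latch implementation; DemoLatch and the demo_latch_* functions are hypothetical names. In the server the equivalent would be WaitLatch() with WL_LATCH_SET | WL_TIMEOUT followed by ResetLatch().

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical minimal latch: "set" is remembered as a byte in a pipe. */
typedef struct DemoLatch
{
    int read_fd;
    int write_fd;
} DemoLatch;

/* Create the pipe; both ends non-blocking. Returns 0 on success, -1 on error. */
static int
demo_latch_init(DemoLatch *latch)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) != 0 ||
        fcntl(fds[1], F_SETFL, O_NONBLOCK) != 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    latch->read_fd = fds[0];
    latch->write_fd = fds[1];
    return 0;
}

/* Set the latch. The byte stays in the pipe until somebody resets it. */
static void
demo_latch_set(DemoLatch *latch)
{
    char byte = 1;

    if (write(latch->write_fd, &byte, 1) < 0)
    {
        /* pipe already full: the latch is set anyway, nothing to do */
    }
}

/* Wait for the latch or the timeout. Returns 1 if set, 0 on timeout, -1 on error. */
static int
demo_latch_wait(DemoLatch *latch, int timeout_ms)
{
    struct pollfd pfd = {.fd = latch->read_fd, .events = POLLIN};
    int           rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Clear the latch by draining the pipe. */
static void
demo_latch_reset(DemoLatch *latch)
{
    char buf[64];

    while (read(latch->read_fd, buf, sizeof(buf)) > 0)
        ;
}
```

If demo_latch_set() runs before demo_latch_wait(), the wait still returns right away because the pending byte makes the pipe readable. A plain sleep has no such memory, which is exactly the case where the signal arrives just before pg_usleep().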
Greetings,
Andres Freund