Change log level for notifying hot standby is waiting non-overflowed snapshot

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Change log level for notifying hot standby is waiting non-overflowed snapshot
Date: 2025-02-03 13:35:20
Message-ID: 02db8cd8e1f527a8b999b94a4bee3165@oss.nttdata.com
Lists: pgsql-hackers

Hi,

When a hot standby is restarted in a state where subtransactions have
overflowed, it may become inaccessible:

$ psql: error: connection to server at "localhost" (::1), port 5433
failed: FATAL: the database system is not yet accepting connections
DETAIL: Consistent recovery state has not been yet reached.

However, the log message that indicates the cause of this issue seems to
be only output at the DEBUG1 level:

    elog(DEBUG1,
         "recovery snapshot waiting for non-overflowed snapshot or "
         "until oldest active xid on standby is at least %u (now %u)",
         standbySnapshotPendingXmin,
         running->oldestRunningXid);

I believe this message would be useful not only for developers but also
for users.
How about raising the log level from DEBUG1 to NOTICE or some other
higher level?

Background:
One of our customers recently encountered an issue where the hot standby
became inaccessible after a restart.
The issue resolved itself after some time and I suspect it was caused by
a subtransaction overflow.
If the log level had been higher, it would have been easier to
diagnose the problem.
Even at NOTICE, the cause may be hard to spot when log_min_messages is
left at its default of WARNING, but a higher log level still seems
better than DEBUG1.
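
To illustrate the point about log_min_messages, here is a rough C sketch
of how a message's level compares against that setting. It is a
simplified model that follows the severity order listed in the
documentation for log_min_messages, not PostgreSQL's actual
implementation; the enum and function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Severity order for log_min_messages, lowest to highest, as listed in
 * the documentation. All names here are hypothetical, for illustration.
 */
typedef enum
{
    SKETCH_DEBUG5, SKETCH_DEBUG4, SKETCH_DEBUG3, SKETCH_DEBUG2, SKETCH_DEBUG1,
    SKETCH_INFO, SKETCH_NOTICE, SKETCH_WARNING, SKETCH_ERROR,
    SKETCH_LOG, SKETCH_FATAL, SKETCH_PANIC
} SketchLevel;

/* A message reaches the server log if its level is at or above the setting. */
static bool
sketch_reaches_server_log(SketchLevel msg, SketchLevel log_min_messages)
{
    return msg >= log_min_messages;
}

/* Checks the claims from this mail under the simplified model. */
static void
sketch_check_mail_claims(void)
{
    /* With the default WARNING, the current DEBUG1 message is not logged. */
    assert(!sketch_reaches_server_log(SKETCH_DEBUG1, SKETCH_WARNING));
    /* NOTICE would still be filtered out by the default setting. */
    assert(!sketch_reaches_server_log(SKETCH_NOTICE, SKETCH_WARNING));
    /* WARNING itself would be logged. */
    assert(sketch_reaches_server_log(SKETCH_WARNING, SKETCH_WARNING));
}
```

Under this model, a message at NOTICE only becomes visible in the server
log once log_min_messages is lowered to NOTICE or below, whereas a message
at WARNING is visible with the default setting.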

I would appreciate your thoughts.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachment Content-Type Size
v1-0001-Change-loglevel-for-waiting-non-overflowed-snapshot.patch text/x-diff 936 bytes
