From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Subject: | Re: Various bugs if segment containing redo pointer does not exist |
Date: | 2023-10-24 00:43:52 |
Message-ID: | 20231024004352.penbryfk3fvisocw@awork3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2023-10-23 16:21:45 -0700, Andres Freund wrote:
> 1) For some reason I haven't yet debugged, the ReadRecord(PANIC) in
> PerformWalRecovery() doesn't PANIC, but instead just returns NULL
>
> We *do* output a DEBUG message, but well, that's insufficient.
The debug is from this backtrace:
#0 XLogFileReadAnyTLI (segno=6476, emode=13, source=XLOG_FROM_PG_WAL) at /home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:4291
#1 0x000055d7b3949db0 in WaitForWALToBecomeAvailable (RecPtr=108649259008, randAccess=true, fetching_ckpt=false, tliRecPtr=108665421680, replayTLI=1,
replayLSN=108665421680, nonblocking=false) at /home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:3697
#2 0x000055d7b39494ff in XLogPageRead (xlogreader=0x55d7b472c470, targetPagePtr=108649250816, reqLen=8192, targetRecPtr=108665421680,
readBuf=0x55d7b47ba5d8 "\024\321\005") at /home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:3278
#3 0x000055d7b3941bb1 in ReadPageInternal (state=0x55d7b472c470, pageptr=108665413632, reqLen=8072)
at /home/andres/src/postgresql/src/backend/access/transam/xlogreader.c:1014
#4 0x000055d7b3940f43 in XLogDecodeNextRecord (state=0x55d7b472c470, nonblocking=false)
at /home/andres/src/postgresql/src/backend/access/transam/xlogreader.c:571
#5 0x000055d7b3941a41 in XLogReadAhead (state=0x55d7b472c470, nonblocking=false) at /home/andres/src/postgresql/src/backend/access/transam/xlogreader.c:947
#6 0x000055d7b393f5fa in XLogPrefetcherNextBlock (pgsr_private=94384934340072, lsn=0x55d7b47cfeb8)
at /home/andres/src/postgresql/src/backend/access/transam/xlogprefetcher.c:496
#7 0x000055d7b393efcd in lrq_prefetch (lrq=0x55d7b47cfe88) at /home/andres/src/postgresql/src/backend/access/transam/xlogprefetcher.c:256
#8 0x000055d7b393f190 in lrq_complete_lsn (lrq=0x55d7b47cfe88, lsn=0) at /home/andres/src/postgresql/src/backend/access/transam/xlogprefetcher.c:294
#9 0x000055d7b39401ba in XLogPrefetcherReadRecord (prefetcher=0x55d7b47bc5e8, errmsg=0x7ffc23505920)
at /home/andres/src/postgresql/src/backend/access/transam/xlogprefetcher.c:1041
#10 0x000055d7b3948ff8 in ReadRecord (xlogprefetcher=0x55d7b47bc5e8, emode=23, fetching_ckpt=false, replayTLI=1)
at /home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:3078
#11 0x000055d7b3946749 in PerformWalRecovery () at /home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:1640
The source of the emode=13=DEBUG2 is that that's hardcoded in
WaitForWALToBecomeAvailable(). I guess the error ought to come from
XLogPageRead(), but all that happens is this:
case XLREAD_FAIL:
if (readFile >= 0)
close(readFile);
readFile = -1;
readLen = 0;
readSource = XLOG_FROM_ANY;
return XLREAD_FAIL;
which *does* error out for some other failures:
errno = save_errno;
ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen),
(errcode_for_file_access(),
errmsg("could not read from WAL segment %s, LSN %X/%X, offset %u: %m",
fname, LSN_FORMAT_ARGS(targetPagePtr),
readOff)));
but not for a file that couldn't be opened. Which wouldn't have to be due to
ENOENT, could also be EACCESS...
xlogreader has undergone a fair bit of changes in the last few releases. As of
now, I can't really understand who is responsible for reporting what kind of
error.
/*
* Data input callback
*
* This callback shall read at least reqLen valid bytes of the xlog page
* starting at targetPagePtr, and store them in readBuf. The callback
* shall return the number of bytes read (never more than XLOG_BLCKSZ), or
* -1 on failure. The callback shall sleep, if necessary, to wait for the
* requested bytes to become available. The callback will not be invoked
* again for the same page unless more than the returned number of bytes
* are needed.
*
* targetRecPtr is the position of the WAL record we're reading. Usually
* it is equal to targetPagePtr + reqLen, but sometimes xlogreader needs
* to read and verify the page or segment header, before it reads the
* actual WAL record it's interested in. In that case, targetRecPtr can
* be used to determine which timeline to read the page from.
*
* The callback shall set ->seg.ws_tli to the TLI of the file the page was
* read from.
*/
XLogPageReadCB page_read;
/*
* Callback to open the specified WAL segment for reading. ->seg.ws_file
* shall be set to the file descriptor of the opened segment. In case of
* failure, an error shall be raised by the callback and it shall not
* return.
*
* "nextSegNo" is the number of the segment to be opened.
*
* "tli_p" is an input/output argument. WALRead() uses it to pass the
* timeline in which the new segment should be found, but the callback can
* use it to return the TLI that it actually opened.
*/
WALSegmentOpenCB segment_open;
My reading of this is that page_read isn't ever supposed to error out - yet we
do it all over (e.g. WALDumpReadPage(), XLogPageRead()) and that segment_open
is supposed to error out - but we don't even use segment_open for
xlogrecovery.c. Who is supposed to error out then? And why is it ok for the
read callbacks to error out directly, if they're supposed to return -1 in case
of failure?
Somewhat of an aside: It also seems "The callback shall sleep, if necessary,
to wait for the requested bytes to become available." is outdated, given that
we *explicitly* don't do so in some cases and support that via
XLREAD_WOULDBLOCK?
I dug through recent changes, expecting to find the problem. But uh, no. I
reproduced this in 9.4, and I think the behaviour might have been introduced
in 9.3 (didn't have a build of that around, hence didn't test that), as part
of:
commit abf5c5c9a4f
Author: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
Date: 2013-02-22 11:43:04 +0200
If recovery.conf is created after "pg_ctl stop -m i", do crash recovery.
Before that, XLogPageRead() called XLogFileReadAnyTLI() with the "incoming"
emode, when in crash recovery -> PANIC in this case. After that commit, the
call to XLogFileReadAnyTLI() was moved to WaitForWALToBecomeAvailable(), but
the emode was changed to unconditionally be DEBUG2.
/*
* Try to restore the file from archive, or read an existing
* file from pg_xlog.
*/
readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2,
currentSource == XLOG_FROM_ARCHIVE ? XLOG_FROM_ANY :
currentSource);
In 9.4 the server actually came up ok after encountering the problem
(destroying data in the process!), not sure where we started to bogusly
initialize shared memory...
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2023-10-24 00:44:01 | Re: Show version of OpenSSL in ./configure output |
Previous Message | Erik Wienhold | 2023-10-24 00:20:26 | Re: Patch: Improve Boolean Predicate JSON Path Docs |