From: | "Imseih (AWS), Sami" <simseih(at)amazon(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [BUG] Panic due to incorrect missingContrecPtr after promotion |
Date: | 2022-02-25 04:27:19 |
Message-ID: | 379F2F42-F19A-4026-AB09-42E6F9ABB5F5@amazon.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> Nice catch! However, I'm not sure I like the patch.
> * made it through and start writing after the portion that persisted.
> * (It's critical to first write an OVERWRITE_CONTRECORD message, which
> * we'll do as soon as we're open for writing new WAL.)
> + *
> + * If the last wal record is ahead of the missing contrecord, this is a
> + * recently promoted primary and we should not write an overwrite
> + * contrecord.
> Before the part, the comment follows the part shown below.
> > * Actually, if WAL ended in an incomplete record, skip the parts that
> So, actually WAL did not ended in an incomplete record. I think
> FinishWalRecover is the last place to do that. (But it could be
> earlier.)
Thanks for the feedback.
## On the primary, an OVERWRITE_CONTRECORD is written. The aborted record (abortedRecPtr) and insert position of the missing contrecord (missingContrecPtr) are set during FinishWalRecovery and after recovery but before writes are accepted, the OVERWRITE_CONTRECORD is written.
## on the primary logs
2022-02-25 02:25:16.208 UTC [7397] LOG: redo done at 0/1FFD2B8 system usage: CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s
2022-02-25 02:25:16.208 UTC [7397] DEBUG: resetting unlogged relations: cleanup 0 init 1
2022-02-25 02:25:16.209 UTC [7397] DEBUG: creating and filling new WAL file
2022-02-25 02:25:16.217 UTC [7397] DEBUG: done creating and filling new WAL file
2022-02-25 02:25:16.217 UTC [7397] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2022-02-25 02:25:16.217 UTC [7397] DEBUG: MultiXact member stop limit is now 4294914944 based on MultiXact 1
2022-02-25 02:25:16.217 UTC [7397] DEBUG: OVERWRITE_CONTRECORD inserted at 0/2000028 for aborted record 0/1FFD2E0 <<------- Attached V3 of patch that adds logging which shows the overwrite record created on the primary
2022-02-25 02:25:16.217 UTC [7395] LOG: checkpoint starting: end-of-recovery immediate wait
2022-02-25 02:25:16.217 UTC [7395] DEBUG: performing replication slot checkpoint
2022-02-25 02:25:16.218 UTC [7395] DEBUG: attempting to remove WAL segments older than log file 000000000000000000000000
2022-02-25 02:25:16.218 UTC [7395] LOG: checkpoint complete: wrote 131 buffers (102.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s; distance=11381 kB, estimate=11381 kB
2022-02-25 02:25:16.219 UTC [7394] DEBUG: starting background worker process "logical replication launcher"
2022-02-25 02:25:16.219 UTC [7394] LOG: database system is ready to accept connections
## A downstream standby now skips over this OVERWRITE_CONTRECORD, but the value for missingContrecPtr is not invalidated afterwards.
## on the standby logs
2022-02-25 02:25:16.616 UTC [7413] DEBUG: sending hot standby feedback xmin 0 epoch 0 catalog_xmin 0 catalog_xmin_epoch 0
2022-02-25 02:25:16.616 UTC [7387] LOG: successfully skipped missing contrecord at 0/1FFD2E0, overwritten at 2022-02-25 02:25:16.2175+00
2022-02-25 02:25:16.616 UTC [7387] CONTEXT: WAL redo at 0/2000028 for XLOG/OVERWRITE_CONTRECORD: lsn 0/1FFD2E0; time 2022-02-25 02:25:16.2175+00
## After promotion, the standby attempts to write the overwrite_contrecord again using the missingContrecPtr LSN which is now in the past.
## on the standby logs
2022-02-25 02:25:16.646 UTC [7387] LOG: invalid record length at 0/201EC70: wanted 24, got 0
2022-02-25 02:25:16.646 UTC [7387] LOG: redo done at 0/201EC48 system usage: CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.55 s
2022-02-25 02:25:16.646 UTC [7387] LOG: last completed transaction was at log time 2022-02-25 02:25:16.301554+00
2022-02-25 02:25:16.646 UTC [7387] DEBUG: resetting unlogged relations: cleanup 0 init 1
2022-02-25 02:25:16.646 UTC [7387] LOG: selected new timeline ID: 2
2022-02-25 02:25:16.646 UTC [7387] DEBUG: updated min recovery point to 0/201EC70 on timeline 1
2022-02-25 02:25:16.656 UTC [7387] DEBUG: could not remove file "pg_wal/000000020000000000000002": No such file or directory
2022-02-25 02:25:16.656 UTC [7387] LOG: archive recovery complete
2022-02-25 02:25:16.656 UTC [7387] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2022-02-25 02:25:16.656 UTC [7387] DEBUG: MultiXact member stop limit is now 4294914944 based on MultiXact 1
2022-02-25 02:25:16.656 UTC [7387] DEBUG: OVERWRITE_CONTRECORD inserted at 0/2000028 for aborted record 0/1FFD2E0 <<------- The same overwrite record is written on the recently promoted standby
2022-02-25 02:25:16.656 UTC [7387] DEBUG: attempting to remove WAL segments newer than log file 000000020000000000000001
2022-02-25 02:25:16.656 UTC [7387] DEBUG: recycled write-ahead log file "000000010000000000000002"
2022-02-25 02:25:16.656 UTC [7387] DEBUG: release all standby locks
2022-02-25 02:25:16.656 UTC [7385] LOG: checkpoint starting: force
2022-02-25 02:25:16.656 UTC [7385] DEBUG: performing replication slot checkpoint
2022-02-25 02:25:16.656 UTC [7385] LOG: request to flush past end of generated WAL; request 0/201EC70, current position 0/2000088
2022-02-25 02:25:16.656 UTC [7385] PANIC: xlog flush request 0/201EC70 is not satisfied --- flushed only to 0/2000088
2022-02-25 02:25:16.657 UTC [7384] LOG: checkpointer process (PID 7385) was terminated by signal 6: Aborted
The purpose of the patch is to check that if the current LSN is beyond missingContrecPtr, and to skip ( incorrectly ) writing an overwrite contrecord that was written on the old writer.
--
Sami Imseih
Amazon Web Services
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Fix-missing-continuation-record-after-standby-pro.patch | application/octet-stream | 5.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2022-02-25 04:35:49 | Re: BufferAlloc: don't take two simultaneous locks |
Previous Message | Noboru Saito | 2022-02-25 04:23:31 | Re: Separate the result of \watch for each query execution (psql) |