From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | smithpb2250(at)gmail(dot)com |
Cc: | exclusion(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: 001_rep_changes.pl fails due to publisher stuck on shutdown |
Date: | 2024-06-06 06:19:20 |
Message-ID: | 20240606.151920.427007697352129737.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Thu, 6 Jun 2024 12:49:45 +1000, Peter Smith <smithpb2250(at)gmail(dot)com> wrote in
> Hi, I have reproduced this multiple times now.
>
> I confirmed the initial post/steps from Alexander. i.e. The test
> script provided [1] gets itself into a state where function
> ReadPageInternal (called by XLogDecodeNextRecord and commented "Wait
> for the next page to become available") constantly returns
> XLREAD_FAIL. Ultimately the test times out because WalSndLoop() loops
> forever, since it never calls WalSndDone() to exit the walsender
> process.
Thanks for the repro; I believe I understand what's happening here.

During server shutdown, the latter half of the last continuation
record may fail to be flushed. This is similar to what is described in
the commit message of commit ff9f111bce. While shutting down,
WalSndLoop() waits for XLogSendLogical() to consume WAL up to
flushPtr, but in this case, the last record cannot complete without
the continuation part starting from flushPtr, which is
missing. However, in such cases, xlogreader.missingContrecPtr is set
to the beginning of the missing part, but something similar to

So, I believe the attached small patch fixes the behavior. I haven't
come up with a good test script for this issue. Something like
026_overwrite_contrecord.pl might work, but this situation seems a bit
more complex than what it handles.

Versions back to 10 should suffer from the same issue and the same
patch will be applicable without significant changes.
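
For readers following along, the stall described above can be modeled with a small standalone sketch. It is not the attached patch and does not use PostgreSQL's real structures: ToyReader, caught_up() and simulate_shutdown_wait() are hypothetical names, and the rule "treat a continuation missing exactly at flushPtr as caught up" is only an assumption about what such a check might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;            /* toy stand-in for a WAL position */
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, simplified reader state; not the real XLogReaderState. */
typedef struct ToyReader
{
    XLogRecPtr EndRecPtr;               /* end of the last fully decoded record */
    XLogRecPtr missingContrecPtr;       /* start of a missing continuation, if any */
} ToyReader;

/*
 * Decide whether the sender may consider itself caught up.
 * With check_contrec = false, only fully decoded records count.
 * With check_contrec = true (assumed rule), a continuation that is
 * missing exactly at flushPtr also counts, since nothing more can be read.
 */
static bool
caught_up(const ToyReader *r, XLogRecPtr flushPtr, bool check_contrec)
{
    if (r->EndRecPtr >= flushPtr)
        return true;
    return check_contrec &&
        r->missingContrecPtr != InvalidXLogRecPtr &&
        r->missingContrecPtr == flushPtr;
}

/*
 * Model the shutdown wait: the reader cannot advance because the tail of
 * the last record was never flushed. Returns the iteration at which the
 * loop exits, or -1 if it is still waiting after max_iterations.
 */
static int
simulate_shutdown_wait(const ToyReader *r, XLogRecPtr flushPtr,
                       bool check_contrec, int max_iterations)
{
    assert(max_iterations > 0);
    for (int i = 1; i <= max_iterations; i++)
    {
        if (caught_up(r, flushPtr, check_contrec))
            return i;
    }
    return -1;
}
```

With a reader whose last complete record ends at 1000 and whose continuation is missing at flushPtr = 1200, the model never exits the wait unless the missing-continuation check is enabled, which mirrors the never-ending WalSndLoop() in the report.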
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-infinite-loop-in-walsender-during-publisher-shut.patch | text/x-patch | 1.7 KB |