From: | "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, fabriziomello(at)gmail(dot)com, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Minimal logical decoding on standbys |
Date: | 2023-01-23 11:03:35 |
Message-ID: | 9e978c6c-0a6e-9271-1203-800c17d91d10@gmail.com |
Lists: | pgsql-hackers |
Hi,
On 1/19/23 10:43 AM, Drouvot, Bertrand wrote:
> Hi,
>
> On 1/19/23 3:46 AM, Andres Freund wrote:
>> Hi,
>>
>> I mean a logical walsender that starts on a standby and continues across
>> promotion of the standby.
>>
>
> Got it, thanks, will do.
>
While working on it, I noticed that, with V41, running:
pg_recvlogical -S active_slot -P test_decoding -d postgres -f - --start
on the standby fails with:
pg_recvlogical: error: unexpected termination of replication stream: ERROR: could not find record while sending logically-decoded data: invalid record length at 0/311C438: wanted 24, got 0
pg_recvlogical: disconnected; waiting 5 seconds to try again
when the standby gets promoted (logical decoding is able to resume correctly after the error, though).
This is fixed in V42 attached (no error anymore and logical decoding through the walsender works correctly after the promotion).
The fix is in 0003, in logical_read_xlog_page() (as compared to V41):
- We now call RecoveryInProgress() (instead of relying on am_cascading_walsender) to check whether the standby has been promoted
- Based on this, currTLI is retrieved with GetXLogReplayRecPtr() or GetWALInsertionTimeLine() (so, with GetWALInsertionTimeLine() after promotion)
- This currTLI is passed as an argument to WALRead() (instead of state->seg.ws_tli, which looked odd anyway, since it ended up being compared with itself in the "tli != state->seg.ws_tli" check in WALRead()). That way WALRead() notices that the timeline has changed and opens the right WAL file.
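To illustrate the idea in the list above, here is a minimal standalone sketch. It is not the code from 0003: ToyServer, ToySegment, toy_current_tli() and toy_wal_read() are hypothetical stand-ins that only model the decision logic (pick the timeline depending on whether recovery is still in progress, then let the reader reopen its segment when the timeline it is given differs from the one it has open).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical stand-in for the server state the real functions consult. */
typedef struct ToyServer
{
    bool        in_recovery;    /* models what RecoveryInProgress() reports */
    TimeLineID  replay_tli;     /* models the TLI known from WAL replay */
    TimeLineID  insert_tli;     /* models the WAL insertion TLI after promotion */
} ToyServer;

/* Hypothetical stand-in for the reader's currently open WAL segment. */
typedef struct ToySegment
{
    bool        is_open;
    TimeLineID  ws_tli;         /* timeline of the open segment */
    int         reopen_count;   /* how many times a segment was (re)opened */
} ToySegment;

/* Pick the current timeline: replay TLI on a standby, insertion TLI once promoted. */
static TimeLineID
toy_current_tli(const ToyServer *srv)
{
    assert(srv != NULL);
    return srv->in_recovery ? srv->replay_tli : srv->insert_tli;
}

/*
 * Model of the reader-side check: if no segment is open, or the requested
 * timeline differs from the open segment's timeline, switch segments.
 * Passing seg->ws_tli itself as "tli" could never trigger the switch.
 */
static void
toy_wal_read(ToySegment *seg, TimeLineID tli)
{
    assert(seg != NULL);
    if (!seg->is_open || tli != seg->ws_tli)
    {
        seg->is_open = true;
        seg->ws_tli = tli;
        seg->reopen_count++;
    }
}
```

In this toy model, calling toy_wal_read(&seg, toy_current_tli(&srv)) before and after flipping in_recovery to false makes the reader switch to the new timeline exactly once, which is the behaviour the list above describes for the walsender across promotion.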
Please find V42 attached.
I'll resume working on the TAP test comments.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v42-0006-Doc-changes-describing-details-about-logical-dec.patch | text/plain | 2.1 KB |
v42-0005-New-TAP-test-for-logical-decoding-on-standby.patch | text/plain | 20.4 KB |
v42-0004-Fixing-Walsender-corner-case-with-logical-decodi.patch | text/plain | 7.5 KB |
v42-0003-Allow-logical-decoding-on-standby.patch | text/plain | 11.7 KB |
v42-0002-Handle-logical-slot-conflicts-on-standby.patch | text/plain | 32.4 KB |
v42-0001-Add-info-in-WAL-records-in-preparation-for-logic.patch | text/plain | 72.1 KB |