From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | emre(at)hasegeli(dot)com, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Unnecessary confirm work on logical replication |
Date: | 2023-11-03 21:34:43 |
Message-ID: | bccca330-6958-d01f-09e1-7042f2dcf895@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 4/11/23 14:58, Emre Hasegeli wrote:
>> In fact, the WAL sender always starts reading WAL from restart_lsn,
>> which in turn is always <= confirmed_flush_lsn. While reading WAL, WAL
>> sender may read XLOG_RUNNING_XACTS WAL record with lsn <=
>> confirmed_flush_lsn. While processing XLOG_RUNNING_XACTS record it may
>> update its restart_lsn and catalog_xmin with current_lsn = lsn fo
>> XLOG_RUNNING_XACTS record. In this situation current_lsn <=
>> confirmed_flush_lsn.
>
> This can only happen when the WAL sender is restarted. However in
> this case, the restart_lsn and catalog_xmin should have already been
> persisted by the previous run of the WAL sender.
>
> I still doubt these calls are necessary. I think there is a
> complicated chicken and egg problem here. Here is my logic:
>
> 1) LogicalConfirmReceivedLocation() is called explicitly when
> confirmed_flush is sent by the replication client.
>
> 2) LogicalConfirmReceivedLocation() is the only place that updates
> confirmed_flush.
>
> 3) The replication client can only send a confirmed_flush for a
> current_lsn it has already received.
>
> 4) These two functions have already run for any current_lsn the
> replication client has received.
>
> 5) These two functions call LogicalConfirmReceivedLocation() only if
> current_lsn <= confirmed_flush.
>
> Thank you for your patience.
>
Hi Emre,
I was going through my TODO list of messages, and I stumbled on this
thread from a couple months back. Do you still think this is something
we should do?
I see there was some discussion about whether this update is needed, or
whether current_lsn can ever be <= confirmed_flush_lsn. I think you may
be right this can't happen, but I wonder if we could verify that by an
assert in a convenient place (instead of just removing the update).
Also, did you try to quantify how expensive this is? The update seems
very cheap, but I guess you just noticed by eye-balling the code, which
is fine ofc. Even if it's cheap/not noticeable, it still may be worth
removing so as not to confuse people improving the code in the future.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2023-11-03 21:56:42 | Draft back-branch release notes are up |
Previous Message | Andres Freund | 2023-11-03 21:16:01 | Re: meson documentation build open issues |