Re: Logical replication timeout problem

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Masahiko Sawada" <sawada(dot)mshk(at)gmail(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Peter Smith" <smithpb2250(at)gmail(dot)com>, "Fabrice Chapuis" <fabrice636861(at)gmail(dot)com>, "Simon Riggs" <simon(dot)riggs(at)enterprisedb(dot)com>, "Petr Jelinek" <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Ajin Cherian" <itsajin(at)gmail(dot)com>
Subject: Re: Logical replication timeout problem
Date: 2022-04-01 02:03:03
Message-ID: 0271e197-2cfe-4627-9c11-47535cc7fdf3@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 31, 2022, at 9:24 AM, Masahiko Sawada wrote:
> The patch basically looks good to me. But the only concern to me is
> that once we get the patch committed, we will have to call
> update_progress() at all paths in callbacks that process changes.
> Which seems poor maintainability.
I didn't like the current fix for the same reason. We need a robust feedback
system for logical replication. We had this discussion in the "skip empty
transactions" thread [1].

> On the other hand, possible another solution would be to add a new
> callback that is called e.g., every 1000 changes so that walsender
> does its job such as timeout handling while processing the decoded
> data in reorderbuffer.c. The callback is set only if the walsender
> does logical decoding, otherwise NULL. With this idea, other plugins
> will also be able to benefit without changes. But I’m not really sure
> it’s a good design, and adding a new callback introduces complexity.
No new callback is required.

In the current code, each output plugin callback is responsible to call
OutputPluginUpdateProgress. It is up to the output plugin author to add calls
to this function. The lack of a call in a callback might cause issues like what
was described in the initial message.

The functions CreateInitDecodingContext and CreateDecodingContext receives the
update_progress function as a parameter. These functions are called in 2
places: (a) streaming replication protocol (CREATE_REPLICATION_SLOT) and (b)
SQL logical decoding functions (pg_logical_*_changes). Case (a) uses
WalSndUpdateProgress as a progress function. Case (b) does not have one because
it is not required -- local decoding/communication. There is no custom update
progress routine for each output plugin which leads me to the question:
couldn't we encapsulate the update progress call into the callback functions?
If so, we could have an output plugin parameter to inform which callbacks we
would like to call the update progress routine. This would simplify the code,
make it less error prone and wouldn't impose a burden on maintainability.

[1] https://www.postgresql.org/message-id/20200309183018.tzkzwu635sd366ej%40alap3.anarazel.de

--
Euler Taveira
EDB https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2022-04-01 02:05:12 Re: should vacuum's first heap pass be read-only?
Previous Message Amit Kapila 2022-04-01 02:00:35 Re: Logical replication timeout problem