RE: Logical replication timeout problem

From:	"wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>
Cc:	"tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	RE: Logical replication timeout problem
Date:	2022-01-26 03:37:28
Message-ID:	OS3PR01MB6275DFFDAC7A59FA148931529E209@OS3PR01MB6275.jpnprd01.prod.outlook.com
Lists:	pgsql-hackers

On Thu, Jan 22, 2022 at 7:12 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Now, one idea to solve this problem could be that whenever we skip
> sending any change we do try to update the plugin progress via
> OutputPluginUpdateProgress(for walsender, it will invoke
> WalSndUpdateProgress), and there it tries to process replies and send
> keep_alive if necessary as we do when we send some data via
> OutputPluginWrite(for walsender, it will invoke WalSndWriteData). I
> don't know whether it is a good idea to invoke such a mechanism for
> every change we skip to send or we should do it after we skip sending
> some threshold of continuous changes. I think later would be
> preferred. Also, we might want to introduce a new parameter
> send_keep_alive to this API so that there is flexibility to invoke
> this mechanism as we don't need to invoke it while we are actually
> sending data and before that, we just update the progress via this
> API.

I tried out a patch following your advice.
I found that if I invoke ProcessRepliesIfAny and WalSndKeepaliveIfNecessary in
function OutputPluginUpdateProgress, the newly added call to
OutputPluginUpdateProgress in pgoutput_change brings notable overhead:
--11.34%--pgoutput_change
          |
          |--8.94%--OutputPluginUpdateProgress
          |          |
          |           --8.70%--WalSndUpdateProgress
          |                     |
          |                     |--7.44%--ProcessRepliesIfAny

So I tried another way: sending a keepalive message to the standby machine
based on the timeout, without asking for a reply (see attachment). With this
approach, the newly added call to OutputPluginUpdateProgress in
pgoutput_change brings only slight overhead:
--3.63%--pgoutput_change
          |
          |--1.40%--get_rel_sync_entry
          |          |
          |           --1.14%--hash_search
          |
           --1.08%--OutputPluginUpdateProgress
                     |
                      --0.85%--WalSndUpdateProgress
Based on the above, I think the second idea, updating the progress only after
skipping some threshold of continuous changes, might be better. I will do
some research on this approach.
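
As a rough illustration of the threshold idea, the standalone sketch below counts consecutive skipped changes and triggers a progress update only once the count reaches a limit. SkipTracker, change_sent, change_skipped, and the value 100 for SKIP_THRESHOLD are all hypothetical names and numbers chosen for this sketch; none of them come from PostgreSQL or from the attached patch.

```c
#include <assert.h>
#include <stdbool.h>

#define SKIP_THRESHOLD 100   /* hypothetical value, chosen for illustration */

/* Hypothetical bookkeeping; NOT a PostgreSQL structure. */
typedef struct SkipTracker
{
    int skipped_in_a_row;    /* consecutive changes skipped so far */
    int progress_updates;    /* stand-in for calls that update progress */
} SkipTracker;

/* A change was actually sent, so the run of skipped changes ends. */
static void
change_sent(SkipTracker *t)
{
    t->skipped_in_a_row = 0;
}

/*
 * A change was skipped. Only every SKIP_THRESHOLD-th consecutive skip
 * triggers a progress update; all other calls just bump a counter.
 * Returns true if a progress update was triggered.
 */
static bool
change_skipped(SkipTracker *t)
{
    assert(SKIP_THRESHOLD > 0);

    if (++t->skipped_in_a_row < SKIP_THRESHOLD)
        return false;

    t->skipped_in_a_row = 0;
    t->progress_updates++;   /* here the real code would update progress */
    return true;
}
```

With this shape, the expensive work runs at most once per SKIP_THRESHOLD skipped changes, at the cost of a delay before the first update in a long run of skips.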

Regards,
Wang wei

Attachment	Content-Type	Size
0001-Fix-the-timeout-of-subscriber-in-long-transactions.patch	application/octet-stream	7.8 KB

In response to

Re: Logical replication timeout problem at 2022-01-22 11:11:42 from Amit Kapila

Responses

Re: Logical replication timeout problem at 2022-01-28 11:35:30 from Fabrice Chapuis
RE: Logical replication timeout problem at 2022-02-08 02:59:34 from wangw.fnst@fujitsu.com
