RE: Logical replication timeout problem

From: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: RE: Logical replication timeout problem
Date: 2022-03-08 01:25:13
Message-ID: OS3PR01MB6275CA9B7AA7FAB71D65055F9E099@OS3PR01MB6275.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Fri, Mar 4, 2022 at 4:26 PM Kuroda, Hayato/黒田 隼人 <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
Thanks for your test and comments.

> Some codes were added in ReorderBufferProcessTXN() for treating DDL,
> but I doubted updating accept_writes is needed.
> IMU, the parameter is read by OutputPluginPrepareWrite() in order align
> messages.
> They should have a header - like 'w' - before their body. But here only a
> keepalive message is sent,
> no meaningful changes, so I think it might be not needed.
> I commented out the line and tested like you did [1], and no timeout and errors
> were found.
> Do you have any reasons for that?
>
> https://www.postgresql.org/message-id/OS3PR01MB6275A95FD44DC6C46058EA389E3B9%40OS3PR01MB6275.jpnprd01.prod.outlook.com
Yes, you are right. We should not set accept_writes to true here.
I also looked into the function WalSndUpdateProgress. It records (via
LagTrackerWrite) the time at which certain messages are sent to the subscriber,
for example from pgoutput_commit_txn.
Then, when the publisher receives the reply message from the subscriber (in
ProcessStandbyReplyMessage), it calls LagTrackerRead to calculate the lag
(see the view pg_stat_replication).
Given the purpose of LagTrackerWrite, I think there is no need to record the
time when only a keepalive message is sent here.
So when the send_keep_alive parameter of WalSndUpdateProgress is true, the
time recording is skipped.
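
To illustrate the idea, here is a small stand-alone model in C. It is only a
sketch of the behaviour described above, not the code in the attached patch:
every model_* function, the sample array and the two counters are hypothetical
stand-ins, and the real WalSndUpdateProgress, LagTrackerWrite and
LagTrackerRead in walsender.c have different signatures and do more work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL types; not the real definitions. */
typedef uint64_t XLogRecPtr;    /* WAL position */
typedef int64_t TimestampTz;    /* timestamp in microseconds */

#define LAG_TRACKER_SIZE 16

/* Toy lag tracker: remembers when each LSN was sent to the subscriber. */
struct lag_sample
{
    XLogRecPtr  lsn;
    TimestampTz time;
};

struct lag_sample lag_samples[LAG_TRACKER_SIZE];
int lag_sample_count = 0;
int keepalives_sent = 0;

/* Models LagTrackerWrite: record the send time of a given LSN. */
void
model_lag_tracker_write(XLogRecPtr lsn, TimestampTz now)
{
    assert(lag_sample_count < LAG_TRACKER_SIZE);
    lag_samples[lag_sample_count].lsn = lsn;
    lag_samples[lag_sample_count].time = now;
    lag_sample_count++;
}

/*
 * Models LagTrackerRead: the subscriber has confirmed 'lsn' at time 'now'.
 * Returns the time elapsed since the newest sample at or before 'lsn',
 * or -1 if there is no such sample. Samples are assumed to be written in
 * increasing LSN order.
 */
TimestampTz
model_lag_tracker_read(XLogRecPtr lsn, TimestampTz now)
{
    TimestampTz sent = -1;

    for (int i = 0; i < lag_sample_count; i++)
    {
        if (lag_samples[i].lsn <= lsn)
            sent = lag_samples[i].time;
    }
    return (sent < 0) ? -1 : now - sent;
}

/*
 * Models the proposed behaviour: when only a keepalive is being sent,
 * skip the lag-tracker write, because no change reached the subscriber.
 */
void
model_update_progress(XLogRecPtr lsn, TimestampTz now, bool send_keep_alive)
{
    if (send_keep_alive)
    {
        keepalives_sent++;      /* the keepalive itself would be sent here */
        return;
    }
    model_lag_tracker_write(lsn, now);
}
```

In this model, a call with send_keep_alive = true leaves the recorded samples
untouched, so a later model_lag_tracker_read only ever measures the delay of
messages that carried real changes.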

> I'm also happy if you give the version number :-).
I have added version information to the patch, starting from version 1.

Attached is the new patch. Changes:
1. Fix the wrong variable setting and skip the unnecessary time recording. [Suggested by Kuroda-San and me.]
2. Introduce version information. [Suggested by Peter and Kuroda-San.]

Regards,
Wang wei

Attachment Content-Type Size
v1-0001-Fix-the-timeout-of-subscriber-in-long-transaction.patch application/octet-stream 13.0 KB
