From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com> |
Subject: | Re: Logical replication timeout problem |
Date: | 2022-04-19 01:32:07 |
Message-ID: | CAD21AoCLaC-Dj=dcz5hQqcxpzi7h_eDsV5uc2156LkrKa6mLQw@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Apr 18, 2022 at 3:16 PM wangw(dot)fnst(at)fujitsu(dot)com
<wangw(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Mon, Apr 18, 2022 at 00:35 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > On Mon, Apr 18, 2022 at 1:01 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Apr 14, 2022 at 5:50 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > > On Wed, Apr 13, 2022 at 7:45 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Mon, Apr 11, 2022 at 12:09 PM wangw(dot)fnst(at)fujitsu(dot)com
> > > > > <wangw(dot)fnst(at)fujitsu(dot)com> wrote:
> > > > > >
> > > > > > So I skip tracking lag during a transaction just like the current HEAD.
> > > > > > Attach the new patch.
> > > > > >
> > > > >
> > > > > Thanks, please find the updated patch where I have slightly
> > > > > modified the comments.
> > > > >
> > > > > Sawada-San, Euler, do you have any opinion on this approach? I
> > > > > personally still prefer the approach implemented in v10 [1]
> > > > > especially due to the latest finding by Wang-San that we can't
> > > > > update the lag-tracker apart from when it is invoked at the transaction end.
> > > > > However, I am fine if we like this approach more.
> > > >
> > > > Thank you for updating the patch.
> > > >
> > > > The current patch looks much better than v10, which requires calling
> > > > update_progress() in every path.
> > > >
> > > > Regarding v15 patch, I'm concerned a bit that the new function name,
> > > > update_progress(), is too generic. How about
> > > > update_replation_progress() or something more specific name?
> > > >
> > >
> > > Do you intend to say update_replication_progress()? The word
> > > 'replation' doesn't make sense to me. I am fine with this suggestion.
> >
> > Yeah, that was a typo. I meant update_replication_progress().
> Thanks for your comments.
>
> > > > Regarding v15 patch, I'm concerned a bit that the new function name,
> > > > update_progress(), is too generic. How about
> > > > update_replation_progress() or something more specific name?
> Improve as suggested. Change the name from update_progress to
> update_replication_progress.
>
> > > > ---
> > > > + if (end_xact)
> > > > + {
> > > > + /* Update progress tracking at xact end. */
> > > > + OutputPluginUpdateProgress(ctx, skipped_xact, end_xact);
> > > > + changes_count = 0;
> > > > + return;
> > > > + }
> > > > +
> > > > + /*
> > > > + * After continuously processing CHANGES_THRESHOLD changes,
> > > > we try to send
> > > > + * a keepalive message if required.
> > > > + *
> > > > + * We don't want to try sending a keepalive message after
> > > > processing each
> > > > + * change as that can have overhead. Testing reveals that there is no
> > > > + * noticeable overhead in doing it after continuously
> > > > processing 100 or so
> > > > + * changes.
> > > > + */
> > > > +#define CHANGES_THRESHOLD 100
> > > > + if (++changes_count >= CHANGES_THRESHOLD)
> > > > + {
> > > > + OutputPluginUpdateProgress(ctx, skipped_xact, end_xact);
> > > > + changes_count = 0;
> > > > + }
> > > >
> > > > Can we merge two if branches since we do the same things? Or did you
> > > > separate them for better readability?
> Improve as suggested. Merge two if-branches.
>
> Attach the new patch.
> 1. Rename the new function(update_progress) to update_replication_progress. [suggestion by Sawada-San]
> 2. Merge two if-branches in new function update_replication_progress. [suggestion by Sawada-San.]
> 3. Improve comments to make them clear. [suggestions by Euler-San.]
Thank you for updating the patch.

+ * For a large transaction, if we don't send any change to the downstream for a
+ * long time(exceeds the wal_receiver_timeout of standby) then it can timeout.
+ * This can happen when all or most of the changes are either not published or
+ * got filtered out.
+ */
+ if(end_xact || ++changes_count >= CHANGES_THRESHOLD)
+ {

We need a space before '(' in the two places above: "long time(exceeds" in
the comment and "if(end_xact" in the condition. The rest looks good to me.

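For readers following the thread, the merged branch discussed here can be illustrated with a minimal standalone sketch. It is not the patch itself: SketchContext and sketch_update_progress() are hypothetical stand-ins for the decoding context and OutputPluginUpdateProgress(), and the static counter is an assumption about how changes_count is kept between calls. Only the condition and CHANGES_THRESHOLD are taken from the quoted hunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the decoding context; it only counts updates. */
typedef struct SketchContext
{
    int progress_updates;   /* number of progress updates issued so far */
} SketchContext;

/*
 * Hypothetical stub in place of OutputPluginUpdateProgress(). The real
 * function may send a keepalive; here we just record that it was called.
 */
static void
sketch_update_progress(SketchContext *ctx, bool skipped_xact, bool end_xact)
{
    (void) skipped_xact;
    (void) end_xact;
    ctx->progress_updates++;
}

/* Threshold value taken from the quoted patch hunk. */
#define CHANGES_THRESHOLD 100

/*
 * Sketch of the merged branch: update progress at transaction end, or
 * after CHANGES_THRESHOLD consecutive changes, and reset the counter in
 * both cases. Keeping the counter as a function-level static is an
 * assumption made for this sketch.
 */
static void
update_replication_progress_sketch(SketchContext *ctx, bool skipped_xact,
                                   bool end_xact)
{
    static int changes_count = 0;

    assert(ctx != NULL);

    if (end_xact || ++changes_count >= CHANGES_THRESHOLD)
    {
        sketch_update_progress(ctx, skipped_xact, end_xact);
        changes_count = 0;
    }
}
```

With this shape, 99 consecutive non-final changes trigger no update, the 100th triggers one, and a call with end_xact set always triggers one regardless of the counter.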
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/