Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Henry Hinze <henry(dot)hinze(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
Date: 2020-11-23 05:21:40
Message-ID: CAFiTN-u92BnFAU3gd4ZsOOQTiLSktyNaRU--R=g+t_fA4Dyq7A@mail.gmail.com
Lists: pgsql-bugs

On Sat, Nov 21, 2020 at 12:23 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Nov 20, 2020 at 11:36 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 20, 2020 at 11:22 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Nov 20, 2020 at 11:17 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > >
> > > > And what about apply_handle_stream_abort? And, I guess we need to
> > > > avoid other related things like the update of
> > > > replorigin_session_origin_lsn, replorigin_session_origin_timestamp,
> > > > etc. in apply_handle_stream_commit(), as we are doing in apply_handle_commit().
> > >
> > > Yes, we need to change these as well. I have tested with the POC
> > > patch and it works fine. I will send the patch after some more
> > > testing.
> > > >
> > > > I think it is difficult to have a reliable test case for this but feel
> > > > free to propose if you have any ideas on the same.
> > >
> > > I am not sure how to write an automated test case for this.
> >
> > Here is the patch.
> >
>
> Few comments:
> ==============
> 1.
> +    /* The synchronization worker runs in single transaction. */
> +    if (!am_tablesync_worker())
> +    {
> +        /*
> +         * Update origin state so we can restart streaming from correct position
> +         * in case of crash.
> +         */
> +        replorigin_session_origin_lsn = commit_data.end_lsn;
> +        replorigin_session_origin_timestamp = commit_data.committime;
> +        CommitTransactionCommand();
> +        pgstat_report_stat(false);
> +        store_flush_position(commit_data.end_lsn);
> +    }
> +    else
> +    {
> +        /* Process any invalidation messages that might have accumulated. */
> +        AcceptInvalidationMessages();
> +        maybe_reread_subscription();
> +    }
>
> Here, why is there no IsTransactionState() check, similar to
> apply_handle_commit()? Also, this code looks the same as in
> apply_handle_commit(); can we move it into a common function, say
> apply_handle_commit_internal or something like that? If we decide to
> do so, then we can move a few more things, as mentioned below, into the
> common function:

Moved to the common function as suggested.

> apply_handle_commit()
> {
>     ..
>     in_remote_transaction = false;
>
>     /* Process any tables that are being synchronized in parallel. */
>     process_syncing_tables(commit_data.end_lsn);
>     ..
> }

This as well.
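
For readers following the thread without the attachment open, here is a rough standalone sketch of the control flow being discussed. It is not the attached patch. The PostgreSQL calls are replaced by plain variables (the names is_tablesync_worker, in_transaction, commits_done, flushed_lsn, invalidations_processed and last_synced_lsn are all hypothetical stand-ins), and the exact placement of the IsTransactionState() check is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL types; assumptions for illustration. */
typedef uint64_t XLogRecPtr;
typedef int64_t TimestampTz;

typedef struct CommitData
{
    XLogRecPtr  end_lsn;
    TimestampTz committime;
} CommitData;

/* Hypothetical state that models the real backend calls. */
static bool is_tablesync_worker = false;    /* am_tablesync_worker() */
static bool in_transaction = false;         /* IsTransactionState() */
static bool in_remote_transaction = false;
static XLogRecPtr origin_lsn = 0;           /* replorigin_session_origin_lsn */
static TimestampTz origin_timestamp = 0;    /* replorigin_session_origin_timestamp */
static XLogRecPtr flushed_lsn = 0;          /* store_flush_position() */
static XLogRecPtr last_synced_lsn = 0;      /* process_syncing_tables() */
static int  commits_done = 0;               /* CommitTransactionCommand() */
static int  invalidations_processed = 0;    /* AcceptInvalidationMessages() */

/* Common tail shared by the commit and stream-commit handlers (sketch). */
static void
apply_handle_commit_internal(const CommitData *commit_data)
{
    /* The synchronization worker runs in a single transaction. */
    if (in_transaction && !is_tablesync_worker)
    {
        /* Record where to restart streaming from after a crash. */
        origin_lsn = commit_data->end_lsn;
        origin_timestamp = commit_data->committime;

        in_transaction = false;
        commits_done++;
        flushed_lsn = commit_data->end_lsn;
    }
    else
    {
        /* No commit here; only process accumulated invalidations. */
        invalidations_processed++;
    }

    in_remote_transaction = false;

    /* Process any tables that are being synchronized in parallel. */
    last_synced_lsn = commit_data->end_lsn;
}
```

The point of the sketch is only that both handlers would share one tail, and that a tablesync worker skips the commit and the origin update while still clearing in_remote_transaction.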

> 2.
> @@ -902,7 +906,9 @@ apply_handle_stream_abort(StringInfo s)
> {
> /* Cleanup the subxact info */
> cleanup_subxact_info();
> - CommitTransactionCommand();
> +
> + if (!am_tablesync_worker())
> + CommitTransactionCommand();
>
> Here, also you can add a comment: "/* The synchronization worker runs
> in single transaction. */"
>

Done.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment: v2-0001-Bug-fix-in-handling-streaming-transaction-in-tabl.patch (text/x-patch, 4.5 KB)
