Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Henry Hinze <henry(dot)hinze(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
Date: 2020-11-21 06:55:00
Message-ID: CAA4eK1Lu=qDLwN51kDHc5sVrRvgybj7ACk5+M3wOuW7GewZZzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Fri, Nov 20, 2020 at 11:36 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Nov 20, 2020 at 11:22 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 20, 2020 at 11:17 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > >
> > > And what about apply_handle_stream_abort? And, I guess we need to
> > > avoid other related things like update of
> > > replorigin_session_origin_lsn, replorigin_session_origin_timestamp,
> > > etc. in apply_handle_stream_commit() as we are apply_handle_commit().
> >
> > Yes, we need to change these as well. I have tested using the POC
> > patch and working fine. I will send the patch after some more
> > testing.
> > >
> > > I think it is difficult to have a reliable test case for this but feel
> > > free to propose if you have any ideas on the same.
> >
> > I am not sure how to write an automated test case for this.
>
> Here is the patch.
>

Few comments:
==============
1.
+ /* The synchronization worker runs in single transaction. */
+ if (!am_tablesync_worker())
+ {
+ /*
+ * Update origin state so we can restart streaming from correct position
+ * in case of crash.
+ */
+ replorigin_session_origin_lsn = commit_data.end_lsn;
+ replorigin_session_origin_timestamp = commit_data.committime;
+ CommitTransactionCommand();
+ pgstat_report_stat(false);
+ store_flush_position(commit_data.end_lsn);
+ }
+ else
+ {
+ /* Process any invalidation messages that might have accumulated. */
+ AcceptInvalidationMessages();
+ maybe_reread_subscription();
+ }

Here, why the check IsTransactionState() is not there similar to
apply_handle_commit? Also, this code looks the same as in
apply_handle_commit(), can we move this into a common function say
apply_handle_commit_internal or something like that? If we decide to
do so then we can move a few more things as mentioned below in the
common function:

apply_handle_commit()
{
..
in_remote_transaction = false;

/* Process any tables that are being synchronized in parallel. */
process_syncing_tables(commit_data.end_lsn);
..
}

2.
@@ -902,7 +906,9 @@ apply_handle_stream_abort(StringInfo s)
{
/* Cleanup the subxact info */
cleanup_subxact_info();
- CommitTransactionCommand();
+
+ if (!am_tablesync_worker())
+ CommitTransactionCommand();

Here, also you can add a comment: "/* The synchronization worker runs
in single transaction. */"

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2020-11-21 09:14:49 BUG #16737: error running compiled C program with connection to Postgres
Previous Message PG Bug reporting form 2020-11-21 06:06:08 BUG #16736: SCRAM authentication is not supported by this driver