Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Henry Hinze <henry(dot)hinze(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
Date: 2020-11-26 02:45:29
Message-ID: CAA4eK1+-GSAYd4VUXwgK-w+bxehPWZTe5dmpvO011Z+k8Lj7kw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Nov 26, 2020 at 6:47 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Wed, Nov 25, 2020 at 7:53 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > I thought there should be a couple small
> > > changes made for this patch, as follows.
> > >
> > > (1) process_sync_tables should be last
> > >
> > > IMO the process_syncing_tables should be always the *last* thing any
> > > apply handler does, because it seems a bad idea for the worker to
> > > change state (e.g. to say it is SUBREL_STATE_SYNCDONE) if in fact
> > > there is still a little bit more processing remaining.
> > >
> >
> > Hmm, what makes you think it is a bad idea, is there any comment which
> > makes you feel so? As far as I understand, the only thing tablesync
> > worker needs to ensure before reaching that state is it should apply
> > till the required lsn which is done by the time it is called in the
> > patch. I think doing it sooner is better because it will let apply
> > worker start its work. In any case, this is not related to this
> > bug-fix patch, so, if you want to build a case for doing it later then
> > we can discuss it separately.
>
> Yes, this review issue is independent of the initial streaming bug the
> patch is solving. But by refactoring this exact line of code into a
> new function apply_handle_commit_internal() I felt the patch is kind
> of now taking ownership of it - so it is not *really* unrelated to
> this patch anymore...
>
> (However if you prefer to treat this review item as a separate issue
> it is fine by me).
>

Let's try to see if we can have a common understanding before I push
this. I don't think this is an issue neither did you presented any
theory or test why you believe it is an issue.

> AFAIK the process_syncing_tables function acts like the main handshake
> mechanism between the "tablesync" and "apply" workers.
> e.g. From "tablesync" worker POV - am I finished? Should I tell the
> apply worker I am DONE? Is it time to exit?
> e.g. From "apply" worker POV - have the tablesyncs caught up yet? Can I proceed?
>

Here, I think we need to have a common understanding of finished and
or DONE. It is not defined by where that function is called but when
it is appropriate to call it. I have tried to explain in my previous
response but not sure you have read it carefully.

> I do think these handshake functions ought to be consistently located
> (e.g always kept last in the apply handler functions where possible).
>

There is no harm in doing so but you haven't presented any theory
which makes me feel that is correct or more appropriate.

> It is true, a slightly different placement (e.g. like current code)
> still passes the tests,
>

Hmm, it is not only about passing the tests. It is not always true
that if tests pass, the code is correct especially in such cases where
writing test is not feasible. We need to understand the design
principle behind doing so which I am trying to explain.

> but I see more problems than benefits to keep
> it that way. IIUC for the tablesync worker to be executing some of
> these handlers at all is a quite a rare occurrence caused by unusual
> timing.

It is due to the reason that we want tablesync worker to apply the WAL
up to LSN apply worker has read by that time so that the apply worker
can continue from there.

> I think putting the process_syncing_tables() call earlier with
> the objective to allow apply worker to proceed a few ms earlier (for
> the already rare case) is not worth it.
>

I think that is not the primary objective. It is also that having it
in common function as the patch is doing allows us to not accidentally
remove it or forget calling it from some other similar place. OTOH, I
don't see any clear advantage of moving out of the common function.

> OTOH, putting the handshake function consistently last does have benefits:
> - the consistent code is easier to read
> - none of the tablesync timing stuff is very easy to test (certainly
> not with automated regression tests). Keeping code consistent means
> less risk of introducing some unforeseen/untestable issue
>
> - that stream_cleanup_files(...) called by
> apply_handle_stream_commit() is supposed to be cleaning up file
> resources. IIUC by allowing the tablesync worker to call
> process_syncing_tables() before this cleanup it means the tablesync
> worker may decide that it is SUBREL_STATE_SYNCDONE and then just exit
> the process! So in this scenario that file resource cleanup will never
> get a chance to happen at all. Isn't that just a plain bug?
>

No, we clean up the files on process exit (via
SharedFileSetDeleteOnProcExit). If we don't have such a mechism, it
would be a problem in more number of cases then you describe. See
comments atop src/backend/replication/logical/worker.c (especially the
STREAMED TRANSACTIONS section).

> >
> > >
> > > (2) misleading comment
> > >
> > > /*
> > > * Start a transaction on stream start, this transaction will be committed
> > > * on the stream stop. We need the transaction for handling the buffile,
> > > * used for serializing the streaming data and subxact info.
> > > */
> > >
> > > That above comment (in the apply_handle_stream_start function) may now
> > > be inaccurate/misleading for the case of tablesync worker, because I
> > > think for tablesync you are explicitly avoiding the tx commit within
> > > the apply_handle_stream_stop().
> > >
> >
> > Okay, I think we can slightly adjust the comment as: "Start a
> > transaction on stream start, this transaction will be committed on the
> > stream stop unless it is a tablesync worker in which case it will be
> > committed after processing all the messages. We need the transaction
> > for handling the buffile, used for serializing the streaming data and
> > subxact info.".
>
> The reworded comment looks ok now.
>

Okay, will change once we agree on previous point.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Rowley 2020-11-26 04:49:05 Re: BUG #16745: delete does not prune partitions on declarative partitioned table
Previous Message Zhiyu ZY13 Xu 2020-11-26 01:35:20 pgadmin--pgagent---the process hang by unknow reasons