From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: walsender.c fileheader comment |
Date: | 2024-07-16 20:56:11 |
Message-ID: | cdf6841c-a283-4f41-b0b0-a46819c827fa@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 6/11/24 04:35, Peter Smith wrote:
> Hi,
>
> I was reading the walsender.c fileheader comment while studying
> another thread. I think if there is logical replication in progress
> then the PROCSIG_WALSND_INIT_STOPPING handler will *always* switch to
> a "stopping" state: e.g.,
>
> /*
> * Handle PROCSIG_WALSND_INIT_STOPPING signal.
> */
> void
> HandleWalSndInitStopping(void)
> {
> Assert(am_walsender);
>
> /*
> * If replication has not yet started, die like with SIGTERM. If
> * replication is active, only set a flag and wake up the main loop. It
> * will send any outstanding WAL, wait for it to be replicated to the
> * standby, and then exit gracefully.
> */
> if (!replication_active)
> kill(MyProcPid, SIGTERM);
> else
> got_STOPPING = true;
> }
>
> ~~~
>
> But the walsender.c fileheader comment seems to be saying something
> slightly different. IIUC, some minor rewording of the comment is
> needed so it describes the code better.
>
> HEAD
> ...
> * shutdown, if logical replication is in progress all existing WAL records
> * are processed followed by a shutdown. Otherwise this causes the walsender
> * to switch to the "stopping" state. In this state, the walsender will reject
> * any further replication commands. The checkpointer begins the shutdown
> ...
>
> SUGGESTION
> .. shutdown. If logical replication is in progress, the walsender
> switches to a "stopping" state. In this state, the walsender will
> reject any further replication commands - but all existing WAL records
> are processed - followed by a shutdown.
>
> ~~~
>
> I attached a patch for the above-suggested change.
>
> Thoughts?
I did look at this, and while the explanation in the current comment may
seem a bit confusing, I'm not sure the suggested changes improve the
situation very much.
This suggests the two comments somehow disagree, but it does not say in
what exactly, so perhaps I just missed it :-(
ISTM there's a bit of confusion what is meant by "stopping" state - you
seem to be interpreting it as a general concept, where the walsender is
requested to stop (through the signal), and starts doing stuff to exit.
But the comments actually talk about WalSnd->state, where "stopping"
means it needs to be set to WALSNDSTATE_STOPPING.
And we only ever switch to that state in two places - in WalSndPhysical
and exec_replication_command. And that does not happen in regular
logical replication (which is what "logical replication is in progress"
refers to) - if you have a walsender just replicating DML, it will never
see the WALSNDSTATE_STOPPING state. It will simply do the cleanup while
still in WALSNDSTATE_STREAMING state, and then just exit.
So from this point of view, the suggestion is actually wrong.
To conclude, I think this probably makes the comments more confusing. If
we want to make it clearer, I'd probably start by clarifying what the
"stopping" state means. Also, it's a bit surprising we may not actually
go through the "stopping" state during shutdown.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Nazir Bilal Yavuz | 2024-07-16 21:11:47 | Re: PG_TEST_EXTRA and meson |
Previous Message | Tristan Partin | 2024-07-16 20:53:45 | Re: Meson far from ready on Windows |