Re: Is it correct to update db state in control file as "shutting down" during end-of-recovery checkpoint?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bharath(dot)rupireddyforpostgres(at)gmail(dot)com
Cc: bossartn(at)amazon(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Date: 2021-12-08 07:35:15
Message-ID: 20211208.163515.1435650949514702327.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Wed, 8 Dec 2021 11:47:30 +0530, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote in
> On Wed, Dec 8, 2021 at 10:59 AM Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
> > >> Another option we might want to consider is to just skip updating the
> > >> state entirely for end-of-recovery checkpoints. The state would
> > >> instead go straight from DB_IN_CRASH_RECOVERY to DB_IN_PRODUCTION. I
> > >> don't know if it's crucial to have a dedicated control file state for
> > >> end-of-recovery checkpoints.

FWIW, I find that simple but sufficient, since I regard the
end-of-recovery checkpoint as a part of recovery. In that case, the
only strange thing here is that the state transition passes through
the DB_SHUTDOWN(ING/ED) states.

On the other hand, when a server is shutting down, the state stays at
DB_IN_PRODUCTION as long as clients are still connected, even if the
shutdown procedure has already started and no new clients can connect
to the server. There's no reason we need to be so particular about
states for recovery-end.
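
To make the two transition paths concrete, here is a minimal sketch,
not PostgreSQL code. The enum is a simplified subset of DBState
containing only the states named in this thread (the real enum has
more members and different values), and recovery_end_states() with its
skip_update flag is a hypothetical helper made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified subset of the control file states named in this thread.
 * Assumption: the real DBState enum has more members and other values.
 */
typedef enum DBState
{
    DB_SHUTDOWNED,
    DB_SHUTDOWNING,
    DB_IN_CRASH_RECOVERY,
    DB_IN_PRODUCTION
} DBState;

/*
 * Hypothetical helper: record the sequence of states written to the
 * control file from crash recovery until the server is in production.
 *
 * skip_update == 0: behavior discussed in this thread, where the
 *                   end-of-recovery checkpoint passes through the
 *                   shutdown states.
 * skip_update != 0: the proposed option, where the checkpoint leaves
 *                   the state untouched.
 *
 * Returns the number of entries stored in out[].
 */
size_t
recovery_end_states(int skip_update, DBState *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL && cap >= 4);

    out[n++] = DB_IN_CRASH_RECOVERY;
    if (!skip_update)
    {
        out[n++] = DB_SHUTDOWNING;  /* set when the checkpoint starts */
        out[n++] = DB_SHUTDOWNED;   /* set when the checkpoint ends */
    }
    out[n++] = DB_IN_PRODUCTION;

    return n;
}
```

With skip_update set, the sequence is just DB_IN_CRASH_RECOVERY
followed by DB_IN_PRODUCTION, which is the simplification being
argued for here.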

> > > Please note that end-of-recovery can take a while in production
> > > systems (we have observed such things working with our customers) and
> > > anything can happen during that period of time. The end-of-recovery
> > > checkpoint is not something that gets finished momentarily. Therefore,
> > > having a separate db state in the control file is useful.
> >
> > Is there some useful distinction between the states for users? ISTM
> > that users will be waiting either way, and I don't know that an extra
> > control file state will help all that much. The main reason I bring
> > up this option is that the list of states is pretty short and appears
> > to be intended to indicate the high-level status of the server. Most
> > of the states are over 20 years old, and the newest one is over 10
> > years old, so I don't think new states can be added willy-nilly.
>
> Firstly, updating the control file with "DB_SHUTDOWNING" and
> "DB_SHUTDOWNED" for end-of-recovery checkpoint is wrong. I don't think
> having DB_IN_CRASH_RECOVERY for end-of-recovery checkpoint is a great
> idea. We have a checkpoint (which most of the time takes a while) in
> between the states DB_IN_CRASH_RECOVERY to DB_IN_PRODUCTION. The state
> DB_IN_END_OF_RECOVERY_CHECKPOINT added by the v1 patch at [1] (in this
> thread) helps users to understand and clearly distinguish what state
> the db is in.
>
> IMHO, the age of the code doesn't stop us adding/fixing/improving the code.
>
> > Of course, I could be off-base and others might agree that this new
> > state would be nice to have.
>
> Let's see what others have to say about this.

I find it a bit too complex for the benefit it brings. When the
end-of-recovery checkpoint takes that long, the state is shown in the
server log, which operators would look at before the control file.

> [1] - https://www.postgresql.org/message-id/CALj2ACVn5M8xgQ3RD%3D6rSTbbXRBdBWZ%3DTTOBOY_5%2BedMCkWjHA%40mail.gmail.com

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
