Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Date: 2022-02-18 07:43:51
Message-ID: 20220218074351.wgfosjt5lql2vjbr@jrouhaud
Lists: pgsql-hackers

Hi,

On Fri, Feb 18, 2022 at 12:20:26PM +0530, Nitin Jadhav wrote:
> >
> > If a timed checkpoint has been triggered and someone then calls
> > pg_start_backup(), which then waits for the end of the current checkpoint
> > (possibly after changing the flags), I think the view should reflect that in
> > some way. Maybe storing an array of (pid, flags) is too much, but at least a
> > counter with the number of processes actively waiting for the end of the
> > checkpoint.
>
> Okay. I feel this can be added as an additional field, but it will not
> replace the backend_pid field, as that represents the pid of the backend
> which triggered the current checkpoint.

I don't think that's true. Requesting a checkpoint means telling the
checkpointer that it should wake up and start a checkpoint (or restartpoint)
if it's not already doing so, so the pid will always be the checkpointer pid.
The only exception is a standalone backend, but in that case you won't be able
to query that view anyway.

Also, while looking at the patch I see the same problem that I mentioned in
the previous thread: the effective flags can be updated once the checkpoint
has started, and as-is the view won't reflect that. It also means that you
can't simply display one of wal, time or force, but a possible combination of
the flags (including the ones not handled in v1).
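
To illustrate what I mean by displaying a combination, here is a minimal
standalone sketch. The CKPT_* constants and describe_ckpt_flags() are
hypothetical names with made-up values, not the real definitions from the
tree and not what the patch does; the only point is that the view would need
to decode a bitmask rather than pick a single cause.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for checkpoint request flags (illustrative values). */
#define CKPT_IMMEDIATE   0x01
#define CKPT_FORCE       0x02
#define CKPT_WAIT        0x04
#define CKPT_CAUSE_XLOG  0x08
#define CKPT_CAUSE_TIME  0x10

static const struct
{
    int         bit;
    const char *name;
} ckpt_flag_names[] = {
    {CKPT_IMMEDIATE, "immediate"},
    {CKPT_FORCE, "force"},
    {CKPT_WAIT, "wait"},
    {CKPT_CAUSE_XLOG, "wal"},
    {CKPT_CAUSE_TIME, "time"},
};

/*
 * Write a space-separated list of every flag set in "flags" into buf.
 * Output is silently truncated if buf is too small.  Returns buf.
 */
static char *
describe_ckpt_flags(int flags, char *buf, size_t buflen)
{
    size_t      i;

    buf[0] = '\0';
    for (i = 0; i < sizeof(ckpt_flag_names) / sizeof(ckpt_flag_names[0]); i++)
    {
        if ((flags & ckpt_flag_names[i].bit) == 0)
            continue;
        /* Separate entries with a single space. */
        if (buf[0] != '\0')
            strncat(buf, " ", buflen - strlen(buf) - 1);
        strncat(buf, ckpt_flag_names[i].name, buflen - strlen(buf) - 1);
    }
    return buf;
}
```

With that, a checkpoint that started as CKPT_CAUSE_TIME and later had
CKPT_FORCE | CKPT_WAIT OR'ed in by another request would be reported as
"force wait time" instead of just "time".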

> Probably a new field named 'processes_waiting' or 'events_waiting' can be
> added for this purpose.

Maybe num_process_waiting?
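
As a rough sketch of what such a counter could look like (assuming C11
atomics purely for illustration; CkptProgress and the ckpt_wait_* helpers are
hypothetical names, and the real shared-memory state would more likely be
protected by the existing locking):

```c
#include <stdatomic.h>

/* Hypothetical progress state; only the waiter counter is shown. */
typedef struct CkptProgress
{
    atomic_int  num_process_waiting;
} CkptProgress;

/* Called by a process just before it starts waiting for the checkpoint. */
static void
ckpt_wait_begin(CkptProgress *p)
{
    atomic_fetch_add(&p->num_process_waiting, 1);
}

/* Called by the same process once its wait is over. */
static void
ckpt_wait_end(CkptProgress *p)
{
    atomic_fetch_sub(&p->num_process_waiting, 1);
}

/* Value the view would report. */
static int
ckpt_num_waiting(CkptProgress *p)
{
    return atomic_load(&p->num_process_waiting);
}
```

The view would then only expose the current value, without trying to track
which pid asked for which flags.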

> > > > > 'checkpoint or restartpoint?'
> > > >
> > > > Do you actually need to store that? Can't it be inferred from
> > > > pg_is_in_recovery()?
> > >
> > > AFAIK we cannot use pg_is_in_recovery() to predict whether it is a
> > > checkpoint or restartpoint because if the system exits from recovery
> > > mode during restartpoint then any query to pg_stat_progress_checkpoint
> > > view will return it as a checkpoint which is ideally not correct. Please
> > > correct me if I am wrong.
> >
> > Recovery ends with an end-of-recovery checkpoint that has to finish before the
> > promotion can happen, so I don't think that a restartpoint can still be in progress
> > if pg_is_in_recovery() returns false.
>
> Probably the writing of buffers or syncing of files may complete before
> pg_is_in_recovery() returns false. But some cleanup operations still
> happen as part of the checkpoint. During this window, we may get a
> false value from pg_is_in_recovery(). Please refer to the following
> piece of code, which is present in CreateRestartpoint().
>
>     if (!RecoveryInProgress())
>         replayTLI = XLogCtl->InsertTimeLineID;

Then maybe we could store the timeline rather than the kind of checkpoint?
You should still be able to compute the information while giving a bit more
information for the same memory usage.
