Re: Report checkpoint progress in server logs

From: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Report checkpoint progress in server logs
Date: 2021-12-29 18:54:06
Message-ID: CAHg+QDerBgyBMrDFAHAqBa0QaNo6J59xjxekkWSEa6oTvb_jvw@mail.gmail.com
Lists: pgsql-hackers

Coincidentally, I was thinking about the same thing yesterday after getting
tired of waiting for a checkpoint to complete on a server.

On Wed, Dec 29, 2021 at 7:41 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Magnus Hagander <magnus(at)hagander(dot)net> writes:
> >> Therefore, reporting the checkpoint progress in the server logs, much
> >> like [1], seems to be the best way IMO.
>
> > I find progress reporting in the logfile to generally be a terrible
> > way of doing things, and the fact that we do it for the startup
> > process is/should be only because we have no other choice, not because
> > it's the right choice.
>
> I'm already pretty seriously unhappy about the log-spamming effects of
> 64da07c41 (default to log_checkpoints=on), and am willing to lay a side
> bet that that gets reverted after we have some field experience with it.
> This proposal seems far worse from that standpoint. Keep in mind that
> our out-of-the-box logging configuration still doesn't have any log
> rotation ability, which means that the noisier the server is in normal
> operation, the sooner you fill your disk.
>

The server is not open for queries while it runs the end-of-recovery
checkpoint, so a catalog view may not help here, but a process title
change or logging would be helpful in such cases. While the server is
running recovery, anxious customers repeatedly ask for the ETA of
recovery completion, and not having visibility into these operations makes
life difficult for the DBA/operations team.

>
> > I think the right choice to solve the *general* problem is the
> > mentioned pg_stat_progress_checkpoints.
>
> +1
>

+1 to this. We need at least a trace of the number of buffers to sync
(num_to_scan) before the checkpoint starts, instead of just emitting the
stats at the end.

Bharath, it would be good to show the buffers-synced counter and the total
buffers to sync, the checkpointer PID, the substep it is running, whether
it is on target for completion, and the checkpoint reason
(manual/timed/forced). BufferSync has several local variables tracking the
sync progress, and we may need some refactoring here.

>
> regards, tom lane
>
>
>
