Quick Links

Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

From:	Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	Nathan Bossart <nathandbossart(at)gmail(dot)com>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: add checkpoint stats of snapshot and mapping files of pg_logical dir
Date:	2022-03-14 05:24:56
Message-ID:	CALj2ACWB8to-Y950t+tWE0FKSmvVKq2iOj=TgR7c0So2Fyi2jQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, Mar 14, 2022 at 10:45 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Sun, Mar 13, 2022 at 02:58:58PM -0700, Nathan Bossart wrote:
> > On Sun, Mar 13, 2022 at 01:54:10PM +0530, Bharath Rupireddy wrote:
> >> Another thing I added in v2 is to not emit snapshot and mapping files
> >> stats in case of restartpoint as logical decoding isn't supported on
> >> standbys, so it doesn't make sense to emit the stats there and cause
> >> server log to grow unnecessarily. Having said that, I added a note
> >> there to change it whenever logical decoding on standbys is supported.
> >
> > I think we actually do want to include this information for restartpoints
> > since some files might be left over from the snapshot that was used to
> > create the standby. Also, presumably these functions could do some work
> > during recovery on a primary.
>
> Yes, I would agree that consistency makes sense here, and it is not
> complicated to add the code to support this code path anyway. There
> is a risk that folks working on logical decoding on standbys overse
> this code path, instead.

I agree to be consistent and emit the message even in case of restartpoint.

> > Another problem I see is that this patch depends on the return value of the
> > lstat() calls that we are trying to remove in 0001 from another thread [0].
> > I think the size of the removed/sync'd files is somewhat useful for
> > understanding disk space usage, but I suspect the time spent performing
> > these tasks is more closely related to the number of files. Do you think
> > reporting the sizes is worth the extra system call?
>
> We are not talking about files that are large either, are we?
>
> Another thing I am a bit annoyed with in this patch is the fact that
> the size of the ereport() call is doubled. The LOG currently
> generated is already bloated, and this does not arrange things.

Yes, this is a concern. Also, when there were no logical replication
slots on a plain server or the server removed or cleaned up all the
snapshot/mappings files, why would anyone want to have these messages
with all 0s in the server log?

Here's what I'm thinking:

Leave the existing "checkpoint/restartpoint complete" messages intact,
add the following in LogCheckpointEnd:

if (CheckpointStats.repl_map_files_rmvd_cnt ||
CheckpointStats.repl_map_files_syncd_cnt ||
CheckpointStats.repl_snap_files_rmvd_cnt)
{
ereport(LOG,
(errmsg("snapbuild snapshot file(s) removed="
UINT64_FORMAT ", size=%zu bytes, time=%ld.%03d s, cutoff LSN=%X/%X; "
"logical rewrite mapping file(s) removed="
UINT64_FORMAT ", size=%zu bytes, synced=" UINT64_FORMAT ", size=%zu
bytes, time=%ld.%03d s, cutoff LSN=%X/%X",
CheckpointStats.repl_snap_files_rmvd_cnt,
CheckpointStats.repl_snap_files_rmvd_sz,
repl_snap_msecs / 1000, (int) (repl_snap_msecs % 1000),
LSN_FORMAT_ARGS(CheckpointStats.repl_snap_cutoff_lsn),
CheckpointStats.repl_map_files_rmvd_cnt,
CheckpointStats.repl_map_files_rmvd_sz,
CheckpointStats.repl_map_files_syncd_cnt,
CheckpointStats.repl_map_files_syncd_sz,
repl_map_msecs / 1000, (int) (repl_map_msecs % 1000),
LSN_FORMAT_ARGS(CheckpointStats.repl_map_cutoff_lsn))));
}

Thoughts?

Regards,
Bharath Rupireddy.

In response to

Re: add checkpoint stats of snapshot and mapping files of pg_logical dir at 2022-03-14 05:15:40 from Michael Paquier

Responses

Re: add checkpoint stats of snapshot and mapping files of pg_logical dir at 2022-03-14 08:03:19 from Michael Paquier

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Kyotaro Horiguchi	2022-03-14 05:31:12	Re: BufferAlloc: don't take two simultaneous locks
Previous Message	Thomas Munro	2022-03-14 05:15:59	Re: WIP: WAL prefetch (another approach)