Re: RFC: pg_stat_logmsg

From: Joe Conway <mail(at)joeconway(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Gurjeet Singh <gurjeet(at)singh(dot)im>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: pg_stat_logmsg
Date: 2024-07-17 11:48:15
Message-ID: adb2f669-7c18-4924-a29a-d7e20e95a0f4@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/16/24 19:08, Michael Paquier wrote:
> On Wed, Jul 17, 2024 at 12:14:36AM +0200, Tomas Vondra wrote:
>> I noticed this patch hasn't moved since September 2023, so I wonder
>> what's the main blocker / what is needed to move this?
>
> + /* Location of permanent stats file (valid when database is shut down) */
> + #define PGLM_DUMP_FILE PGSTAT_STAT_PERMANENT_DIRECTORY
> "/pg_stat_logmsg.stat
>
> Perhaps this does not count as a valid reason, but does it really make
> sense to implement things this way, knowing that this could be changed
> to rely on a potential pluggable pgstats? I mean this one I've
> proposed:
> https://www.postgresql.org/message-id/Zmqm9j5EO0I4W8dx%40paquier.xyz

Yep, see my adjacent reply to Tomas.

> One potential implementation is to stick that to be fixed-numbered,
> because there is a maximum cap to the number of entries proposed by
> the patch, while keeping the whole in memory.
>
> + logmsg_store(ErrorData *edata, Size *logmsg_offset,
> + int *logmsg_len, int *gc_count)
>
> The patch shares a lot of perks with pg_stat_statements that don't
> scale well. I'm wondering if it is a good idea to duplicate these
> properties in a second, different, module, like the external file can
> could be written out on a periodic basis depending on the workload.
> I am not saying that the other thread is a magic solution for
> everything (not looked yet at how this would plug with the cap in
> entries that pg_stat_statements wants), just one option on the table.
>
>> As for the code, I wonder if the instability of line numbers could be a
>> problem - these can change (a little bit) between minor releases, so
>> after an upgrade we'll read the dump file with line numbers from the old
>> release, and then start adding entries with new line numbers. Do we need
>> to handle this in some way?
>
> Indeed. Perhaps a PostgreSQL version number assigned to each entry to
> know from which binary an entry comes from? This would cost a couple
> of extra bytes for each entry still that would be the best information
> possible to match that with a correct code tree. If it comes to that,
> even getting down to a commit SHA1 could be useful to provide the
> lowest level of granularity. Another thing would be to give up on the
> line number, stick to the uniqueness in the stats entries with the
> errcode and the file name, but that won't help for things like
> tablecmds.c.

I think including version in the key makes most sense. Also do we even
have a mechanism to grab the commit sha in running code?

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2024-07-17 11:55:04 Re: long-standing data loss bug in initial sync of logical replication
Previous Message Joe Conway 2024-07-17 11:43:13 Re: RFC: pg_stat_logmsg