From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | Cary Huang <cary(dot)huang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add last failed connection error message to pg_stat_wal_receiver |
Date: | 2022-08-18 03:31:00 |
Message-ID: | Yv2ydB/7bLUEY/tr@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Aug 04, 2022 at 03:27:11PM +0530, Bharath Rupireddy wrote:
> Good point. The walreceiver can exit for any reason. We can either 1)
> store for all the error messages or 2) think of using sigsetjmp but
> that only catches the ERROR kinds, leaving FATAL and PANIC messages.
> The option (1) is simple but there are problems - we may miss storing
> future error messages, good commenting and reviewing may help here and
> all the error messages now need to be stored in string, which is
> complex. The option (2) seems reasonable but we will miss FATAL and
> PANIC messages (we have many ERRORs, 2 FATALs, 3 PANICs). Maybe a
> combination of option (1) for FATALs and PANICs, and option (2) for
> ERRORs helps.
>
> Thoughts?
PANIC is not something you'd care about as the system would go down as
and shared memory would be reset (right?) even if restart_on_crash is
enabled. Perhaps it would help here to use something like a macro to
catch and save the error, in a style similar to what's in hba.c for
example, which is the closest example I can think of, even if on ERROR
we don't really care about the error string anyway as there is nothing
to report back to the SQL views used for the HBA/ident files.
FATAL may prove to be tricky though, because I'd expect the error to
be saved in shared memory in this case. This is particularly critical
as this takes the WAL receiver process down, actually.
Anyway, outside the potential scope of the proposal, there are more
things that I find strange with the code:
- Why isn't the string reset when the WAL receiver is starting up?
That surely is not OK to keep a past state not referring to what
actually happens with a receiver currently running.
- pg_stat_wal_receiver (system view) reports no rows if pid is NULL,
which would be the state stored in shared memory after a connection.
This means that one would never be able to see last_conn_error except
when calling directly the SQL function pg_stat_get_wal_receiver().
One could say that we should report a row for this view all the time,
but this creates a compatibility breakage: existing application
assuming something like (one row <=> WAL receiver running) could
break.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Andrey Borodin | 2022-08-18 03:35:15 | Re: MultiXact\SLRU buffers configuration |
Previous Message | David Rowley | 2022-08-18 03:17:33 | Re: shadow variables - pg15 edition |