From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | alex(at)altmetric(dot)com |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17327: Postgres server does not correctly emit error for max_slot_wal_keep_size being breached |
Date: | 2021-12-13 05:44:42 |
Message-ID: | 20211213.144442.1533332056665514669.horikyota.ntt@gmail.com |
Lists: | pgsql-bugs |
At Fri, 10 Dec 2021 15:46:11 +0000, Alex Enachioaie <alex(at)altmetric(dot)com> wrote in
> So, essentially the server side log emitted on a temporary
> replication slot breaching the max_slot_wal_keep_size limit is only:
>
> 2021-12-03 16:21:54 UTC [29724-2647] LOG: terminating process 42601 to
> release replication slot "pg_basebackup_42601"
>
> whereas for a persistent replication slot we get an additional line
> that clearly states _why_ the replication process was terminated:
>
> 2021-12-03 00:57:16 UTC [29724-2645] LOG: terminating process 3899 to
> release replication slot "backup"
> 2021-12-03 00:57:16 UTC [29724-2646] LOG: invalidating slot "backup"
> because its restart_lsn 47198/1E000000 exceeds max_slot_wal_keep_size
>
> I'm not sure if this means that in the case of a temporary slot it
> does not get invalidated at all (I've not looked at the code), or it's
> simply that we don't emit a log message when it does because the slot
> would be discarded anyway, but such a message would be very useful for
> diagnostic purposes imo.
The "invalidating slot" message is emitted when the slot needs to be
invalidated, that is, when the slot persists after the user process is
terminated. Thus the message cannot be seen for temporary slots since
they are removed at process termination and no longer exist after
that.
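
To make the distinction concrete, here is a toy model of the behavior described above. It is only a sketch, not the server's code: the Slot class and the enforce_wal_limit function are hypothetical names invented for illustration, and the message texts are copied from the log lines quoted in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    # Hypothetical stand-in for a replication slot; not a PostgreSQL structure.
    name: str
    temporary: bool
    active_pid: Optional[int]
    restart_lsn: str


def enforce_wal_limit(slot: Slot) -> List[str]:
    """Return the LOG lines this toy model emits for a slot that is over the limit."""
    logs = []
    if slot.active_pid is not None:
        logs.append(
            f'terminating process {slot.active_pid} '
            f'to release replication slot "{slot.name}"'
        )
        slot.active_pid = None
        if slot.temporary:
            # A temporary slot is removed when its process terminates,
            # so nothing is left to invalidate and no second line appears.
            return logs
    # Only a slot that still exists after termination gets invalidated.
    logs.append(
        f'invalidating slot "{slot.name}" because its restart_lsn '
        f'{slot.restart_lsn} exceeds max_slot_wal_keep_size'
    )
    return logs
```

In this model a temporary slot yields only the "terminating process" line, while a persistent slot such as "backup" yields both lines, matching the two log excerpts quoted above.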
At Wed, 08 Dec 2021 11:23:35 +0000, PG Bug reporting form <noreply(at)postgresql(dot)org> wrote in
> The core issue here then in our opinion is that Postgres server should log
> an error when the max_slot_wal_keep_size limit is reached for temporary
> replication slots as well as for permanent ones as otherwise
> users/administrators are presented only with non-descript connection
> termination errors which do not point to the actual cause of the problem.
If you mean the "invalidating slot" message by "an error", that
wouldn't happen, since invalidation doesn't actually happen for
temporary slots. Alternatively, we could change the messages like
this. Does this make sense to you?
> LOG: terminating process 42601 to release temporary replication slot "pg_basebackup_42601"
> DETAIL: The slot will be dropped by the process termination.
> LOG: terminating process 3899 to release persistent replication slot "backup"
...
> LOG: invalidating slot "backup" because its restart_lsn 47198/1E000000 exceeds max_slot_wal_keep_size
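
As a rough illustration of the proposed wording (a sketch only; termination_messages is a hypothetical helper, not an existing server function), the termination message would name the slot kind and add a DETAIL line only for temporary slots:

```python
from typing import List


def termination_messages(pid: int, slot_name: str, temporary: bool) -> List[str]:
    """Format the termination lines proposed above for one slot."""
    kind = "temporary" if temporary else "persistent"
    lines = [
        f'LOG: terminating process {pid} to release '
        f'{kind} replication slot "{slot_name}"'
    ]
    if temporary:
        # Only temporary slots disappear with their process, so only
        # they get the explanatory DETAIL line.
        lines.append("DETAIL: The slot will be dropped by the process termination.")
    return lines
```

The "invalidating slot" line for persistent slots would stay as it is today and is not modeled here.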
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center