From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Cc: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date: | 2025-01-29 00:09:32 |
Message-ID: | CAD21AoADdcgjRgH9kyJnAAe_dkoxPx8DKwGA0PfVx6_qjADz=g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Jan 28, 2025 at 1:39 AM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Sawada-san,
>
> I love the idea. I've roughly tested the patch and worked on my env.
> Here are initial comments...
Thank you for looking at the patch!
>
> 1. xloglevelworker.c
> ```
> +#include "replication/logicalxlog.h"
> ```
>
> xloglevelworker.c includes replication/logicalxlog.h, but it does not exist.
> The line had to be removed to build and test it.
>
> 2.
> ```
> +static void
> +writeUpdateWalLevel(int new_wal_level)
> +{
> + XLogBeginInsert();
> + XLogRegisterData((char *) (&new_wal_level), sizeof(bool));
> + XLogInsert(RM_XLOG_ID, XLOG_UPDATE_WAL_LEVEL);
> +}
> ```
>
> IIUC the data length should be sizeof(int) instead of sizeof(bool).
Agreed to fix them.
>
> 3.
> Is there a reason why the process does not wait till the archiver exits?
No. I didn't implement this part as the patch was just for
proof-of-concept. I think it would be better to wait for it to exit.
>
> 4.
> When I dumped wal files, I found that XLOG_UPDATE_WAL_LEVEL cannot be recognized:
>
> ```
> rmgr: XLOG len (rec/tot): 27/ 27, tx: 0, lsn: 0/03050838, prev 0/03050800, desc: UNKNOWN (f0) wal_level logical
> ```
>
> xlog_identify() must be updated as well.
Will fix.
>
> 5.
> When I changed "logical" to "replica", postgres outputs like below:
>
> ```
> LOG: received SIGHUP, reloading configuration files
> LOG: parameter "wal_level" changed to "replica"
> LOG: wal_level control worker started
> LOG: changing wal_level from "logical" to "replica"
> LOG: wal_level has been decreased to "replica"
> LOG: successfully changed wal_level from "logical" to "replica"
> ```
>
> ISTM that both postmaster and the wal_level control worker said something like
> "wal_level changed", which is bit strange for me. Since GUC can't be renamed,
> can we use another name for the wal_level control state?
I'm concerned that users could be confused if two different names
refer to substantially the same thing.
Having said that, I guess that we need to drastically change the
messages. For example, I think that the wal_level worker should say
something like "successfully made 'logical' wal_level effective"
instead of saying something like "changed wal_level value". Also,
users might not need gradual messages when increasing 'minimal' to
'logical' or decreasing 'logical' to 'minimal'.
>
> 6.
> With the patch present, the wal_level can be changed to the minimal even when the
> streaming replication is going. If we do that, the walsender exits immediately and
> the below FATAL appears periodically until the standby stops. Same things can be
> said for the logical replication:
>
> ```
> FATAL: streaming replication receiver "walreceiver" could not connect to the primary server:
> connection to server on socket "/tmp/.s.PGSQL.oooo" failed:
> FATAL: WAL senders require "wal_level" to be "replica" or "logical
> ```
>
> I know this is not a perfect, but can we avoid the issue by reject the GUC update
> if the walsender exists? Another approach is not to update the value when replication
> slots need to be invalidated.
Does it mean that we reject the config file from being reloaded in
that case? I have no idea how to reject it in a case where the
wal_level in postgresql.conf changed and the user did 'pg_ctl reload'.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2025-01-29 00:30:25 | Re: Adding skip scan (including MDAM style range skip scan) to nbtree |
Previous Message | Michael Paquier | 2025-01-28 23:53:17 | Re: [PATCH] Improve code coverage of network address functions |