Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-01-10 08:33:57
Message-ID: CAD21AoAuxr22MEqdt2YHbE9PaiPgc3=3f_1oXEeLnXXkO3uiYw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 9, 2025 at 3:29 AM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Tue, Dec 31, 2024 at 10:15 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Hi all,
> >
> > Logical decoding (and logical replication) are available only when
> > wal_level = logical. As the documentation says[1], Using the 'logical'
> > level increases the WAL volume which could negatively affect the
> > performance. For that reason, users might want to start with using
> > 'replica', but when they want to use logical decoding they need a
> > server restart to increase wal_level to 'logical'. My goal is to allow
> > users who are using 'replica' level to use logical decoding without a
> > server restart. There are other GUC parameters related to logical
> > decoding and logical replication such as max_wal_senders,
> > max_logical_replication_workers, and max_replication_slots, but even
> > if users set these parameters >0, there would not be a noticeable
> > performance impact. And their default values are already >0. So I'd
> > like to focus on making only the wal_level dynamic GUC parameter.
> > There are several earlier discussions[2][3] but no one has submitted
> > patches unless I'm missing something.
> >
> > The first idea I came up with is to make the wal_level a PGC_SIGHUP
> > parameter. However, it affects not only setting 'replica' to 'logical'
> > but also setting 'minimal' to 'replica' or higher. I'm not sure the
> > latter case is common and it might require a checkpoint. I don't want
> > to make the patch complex for uncommon cases.
> >
> > The second idea is to somehow allow both WAL-logging logical info and
> > logical decoding even when wal_level is 'replica'. I've attached a PoC
> > patch for that. The patch introduces new SQL functions such as
> > pg_activate_logical_decoding() and pg_deactivate_logical_decoding().
> > These functions are available only when wal_level is 'repilca'(or
> > higher). In pg_activate_logical_decoding(), we set the status of
> > logical decoding stored on the shared memory from 'disabled' to
> > 'xlog-logical-info', allowing all processes to write logical
> > information to WAL records for logical decoding. But the logical
> > decoding is still not allowed. Once we confirm all in-progress
> > transactions completed, we switch the status to
> > 'logical-decoding-ready', meaning that users can create logical
> > replication slots and use logical decoding.
> >
> > Overall, with the patch, there are two ways to enable logical
> > decoding: setting wal_level to 'logical' and calling
> > pg_activate_logical_decoding() when wal_level is 'replica'. I left the
> > 'logical' level for backward compatibility and for users who want to
> > enable the logical decoding without calling that SQL function. If we
> > can automatically enable the logical decoding when creating the first
> > logical replication slot, probably we no longer need the 'logical'
> > level. There is room to discuss the user interface. Feedback is very
> > welcome.
> >
>
> If a server is running at minimal wal_level and they want to enable
> logical replication, they would still need a server restart. That
> would be rare but not completely absent.

Currently we don't allow the server to start with the 'minimal' level
and max_wal_senders > 0. Even if we support changing 'minimal' to
'logical' without a server restart, we still need a server restart to
increase max_wal_senders for users who want to use logical
replication. Or we need to eliminate this restriction too. I guess it
would be too complex for such uncommon use cases.

>
> Our documentation says "wal_level determines how much information is
> written to the WAL.". Users would may not expect that the WAL amount
> changes while wal_level = replica depending upon whether logical
> decoding is possible. It may be possible to set the expectations right
> by changing the documentation. It's not in the patch, so I am not sure
> whether this is considered.

We should mention that in the doc. The WAL amount changes depending on
not only wal_level but also other parameters such as wal_log_hints and
full_page_writes.

> Cloud providers do not like multiple ways of changing configuration
> esp. when they can not control it. See [1]. Changing wal_level through
> a SQL function may fit the same category.

Thank you for pointing it out. This would support the idea of
automatically enabling logical decoding.

> I agree that it would be a lot of work to make all combinations of
> wal_level changes work, but changing wal_level through SIGHUP looks
> like a cleaner solution. Is there way that we make the GUC SIGHUP but
> disallow certain combinations of old and new values?

While I agree that it's cleaner I think there is no way today. I think
we need to invent something for that.

Another idea would be to have another SIGHUP GUC parameter to control
logical info as another way to enable logical info WAL-logging while
trying to deprecate the 'logical' level over some releases. While it
doesn't need a SQL function, it could confuse users since we will
require two GUC parameters for doing things that we used to use one
GUC parameter.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ryo Kanbayashi 2025-01-10 08:45:00 Re: ecpg command does not warn COPY ... FROM STDIN;
Previous Message Bertrand Drouvot 2025-01-10 08:23:46 Re: Make pg_stat_io view count IOs as bytes instead of blocks