From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Time delayed LR (WAS Re: logical replication restrictions) |
Date: | 2023-03-01 05:26:48 |
Message-ID: | CAD21AoBe2-TtjZibGAa5BPnZPj2LsMtofKGAO5rERt+GDGmtAQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Mar 1, 2023 at 1:55 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Mar 1, 2023 at 8:18 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Mar 1, 2023 at 12:51 AM Hayato Kuroda (Fujitsu)
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> >
> > Thinking of side effects of this feature (no matter where we delay
> > applying the changes), on the publisher, vacuum cannot collect garbage
> > and WAL cannot be recycled. Is that okay in the first place? The point
> > is that the subscription setting affects the publisher. That is,
> > min_send_delay is specified on the subscriber but the symptoms that
> > could ultimately lead to a server crash appear on the publisher, which
> > sounds dangerous to me.
> >
> > Imagine a service or system where there is a publication server
> > that is somewhat exposed, so that a user (or a subsystem) can
> > arbitrarily create a subscriber to replicate a subset of the data. A
> > malicious user can make the publisher crash by creating a subscription
> > with, say, min_send_delay = 20d. max_slot_wal_keep_size helps in this
> > situation, but it's -1 by default.
> >
>
> By publisher crash, do you mean that a disk-full situation can
> cause the publisher to stop/panic?
Exactly.
> Can't a malicious user block the
> replication in other ways as well and let the publisher stall (or
> crash the publisher) even without setting min_send_delay? Basically,
> one needs to either disable the subscription or create a
> constraint-violating row in the table to make that happen. If the
> system is exposed for arbitrarily allowing the creation of a
> subscription then a malicious user can create a subscription similar
> to one existing subscription and block the replication due to
> constraint violations. I don't think it would be so easy to bypass the
> current system that a malicious user will be allowed to create/alter
> subscriptions arbitrarily.
Right. But the difference is that with min_send_delay, all it takes is
creating a subscription.
> Similarly, if there is a network issue
> (unreachable or slow), one will see similar symptoms. I think
> retention of data and WAL on publisher do rely on acknowledgment from
> subscribers and delay in that due to any reason can lead to the
> symptoms you describe above.
I think that WAL files piling up due to a slow network is a different
story, since that problem is not confined to the subscriber side.
> We have documented at least one such case
> already where during Drop Subscription, if the network is not
> reachable then also, a similar problem can happen and users need to be
> careful about it [1].
Apart from the bad use-case example I mentioned, in general, WAL files
piling up because of a replication slot has many bad effects on the
system. I'm concerned that the side effect of this feature (at least
in the current design) is too large compared to the benefit, and I'm
afraid that users might end up using this feature without understanding
the side effect well. It might be okay if we thoroughly document it, but
I'm not sure.
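To put rough numbers on the concern, here is a minimal back-of-the-envelope
sketch. It assumes a constant WAL generation rate on the publisher; the
function names and the 1 MiB/s figure are illustrative assumptions, not
anything taken from the patch. It only models the idea that a slot whose
consumer lags by the configured delay pins roughly rate * delay bytes of WAL,
unless max_slot_wal_keep_size caps it.

```python
# Rough model only: names and numbers are hypothetical, for illustration.

MiB = 1024 * 1024
GiB = 1024 * MiB
DAY = 24 * 60 * 60  # seconds


def retained_wal_bytes(wal_rate_bytes_per_sec, delay_sec,
                       max_slot_wal_keep_size_bytes=-1):
    """Estimate the WAL a publisher keeps for a slot lagging by delay_sec.

    max_slot_wal_keep_size_bytes = -1 mirrors the default (no limit).
    With a non-negative limit, retention is capped at about that size;
    a slot that falls further behind is invalidated instead.
    """
    wanted = wal_rate_bytes_per_sec * delay_sec
    if max_slot_wal_keep_size_bytes < 0:
        return wanted
    return min(wanted, max_slot_wal_keep_size_bytes)


def seconds_until_disk_full(free_bytes, wal_rate_bytes_per_sec):
    """Time until pg_wal fills the free space if no WAL can be recycled."""
    return free_bytes / wal_rate_bytes_per_sec


# A 20-day delay at 1 MiB/s of WAL, with the default (unlimited) setting.
unlimited = retained_wal_bytes(1 * MiB, 20 * DAY)

# The same workload with a 10 GiB cap on slot-retained WAL.
capped = retained_wal_bytes(1 * MiB, 20 * DAY, 10 * GiB)

print(unlimited // GiB, "GiB retained without a limit")
print(capped // GiB, "GiB retained with a 10 GiB limit")
print(seconds_until_disk_full(100 * GiB, 1 * MiB) / 3600, "hours to fill 100 GiB")
```

Under these assumptions the delay alone asks the publisher to hold far more
WAL than a typical pg_wal volume has room for, which is why the -1 default
matters here.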
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com