From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: bogus: logical replication rows/cols combinations |
Date: | 2022-05-02 05:31:13 |
Message-ID: | CAA4eK1LGX-ig==QyL+==nKvcAS3qFU7=NiKL77ukUT-Q_4XncQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, May 2, 2022 at 3:27 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 4/30/22 12:11, Amit Kapila wrote:
> > On Sat, Apr 30, 2022 at 3:01 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> >>
> >> My proposal is that if users want to define multiple publications, and
> >> their definitions conflict in a way that would behave ridiculously (==
> >> bound to cause data inconsistencies eventually), an error should be
> >> thrown. Maybe we will not be able to catch all bogus cases, but we can
> >> be prepared for the most obvious ones, and patch later when we find
> >> others.
> >>
> >
> > I agree with throwing errors for obvious/known bogus cases but do we
> > want to throw errors or restrict the combining of column lists when
> > row filters are present in all cases? See some examples [1 ] where it
> > may be valid to combine them.
> >
>
> I think there are three challenges:
>
> (a) Deciding what's an obvious bug or an unsupported case (e.g. because
> it's not clear what's the correct behavior / way to merge column lists).
>
> (b) When / where to detect the issue.
>
> (c) Making sure this does not break/prevent existing use cases.
>
>
> As I said before [1], I think the issue stems from essentially allowing
> DML to have different row filters / column lists. So we could forbid
> publications to specify WITH (publish=...) and one of the two features,
>
I don't think this is feasible for row filters because that would mean
publishing all actions because we have a restriction that all columns
referenced in the row filter expression are part of the REPLICA
IDENTITY index. This restriction is only valid for updates/deletes, so
if we allow all pubactions then this will be imposed on inserts as
well. A similar restriction is there for column lists as well, so I
don't think we can do it there as well. Do you have some idea to
address it?
> or make sure subscription does not combine multiple such publications.
>
Yeah, or don't allow to define such publications in the first place so
that different subscriptions can't combine them but I guess that might
forbid some useful cases as well where publication may not get
combined with other publications.
> The second option has the annoying consequence that it makes this
> useless for the "data redaction" use case I described in [2], because
> that relies on combining multiple publications.
>
True, but as a workaround users can create different subscriptions for
different publications.
> Furthermore, what if the publications change after the subscriptions get
> created? Will we be able to detect the error etc.?
>
I think from that apart from 'Create Subscription', the same check
needs to be added for Alter Subscription ... Refresh, Alter
Subscription ... Enable.
In the publication side, we need an additional check in Alter
Publication ... SET table variant. One idea is that we get all other
publications for which the corresponding relation is defined. And then
if we find anything which we don't want to allow then we can throw an
error. This will forbid some useful cases as well as mentioned above.
So, the other possibility is to expose all publications for a
walsender, and then we can find the exact set of publications where
the current publication is used with other publications and we can
check only those publications. So, if we have three walsenders
(walsnd1: pub1, pub2; walsnd2 pub2; walsnd3: pub2, pub3) in the system
and we are currently altering publication pub1 then we need to check
only pub3 for any conflicting conditions. Yet another simple way could
be that we don't allow to change column list via Alter Publication ...
Set variant because the other variants anyway need REFRESH publication
which we have covered.
I think it is tricky to decide what exactly we want to forbid, so, we
may want to follow something simple like if the column list and row
filters for a table are different in the required set of publications
then we treat it as an unsupported case. I think this will prohibit
some useful cases but should probably forbid the cases we are worried
about here.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2022-05-02 05:36:23 | Re: bogus: logical replication rows/cols combinations |
Previous Message | Noah Misch | 2022-05-02 04:27:18 | Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set |