Re: Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Subject: Re: Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY
Date: 2024-11-12 10:27:04
Message-ID: CAA4eK1KOPDKP0gvoCKYJPLSKRMWjrjj1cZvcGeH13U_t+acRRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 12, 2024 at 2:15 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2024-Nov-09, Amit Kapila wrote:
>
> > On Fri, Nov 8, 2024 at 5:17 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > >
> > > On 2024-Nov-07, Amit Kapila wrote:
> > >
> > > > BTW, I was thinking as to how to fix it on back branches and it seems
> > > > we should restrict to define REPLICA IDENTITY on stored generated
> > > > columns in the first place in back branches as those can't be
> > > > replicated.
>
> > > I think a blanket restriction of this sort is not a good idea (at least
> > > in back branches), because there might be people using replica
> > > identities with stacks other than pgoutput.
> >
> > Do you mean to say that people using plugins other than pgoutput may
> > already be sending generated columns, so defining replica identity
> > should be okay for them?
>
> Yes.
>
> > If we don't want to add a restriction to not create replica identity
> > on generated columns then I think the solution similar to HEAD should
> > be okay which is to restrict UPDATE/DELETE in such cases.
>
> Hmm, I don't know about this. Maybe nobody cares, but I'm uneasy about
> it. I'm wondering about hypothetical cases where people is already
> using this combination of features in stable branches, without pgoutput.
> I think it's not great to add restrictions that didn't exist when they
> upgraded to some stable branch. In branch master it's probably okay,
> because they'll have to test before upgrading and they'll realize the
> problem and have the chance to adjust (or complain) before calling the
> upgrade good. But if we do that for stable branches, we'd deprive them
> of the ability to do minor upgrades, which would be Not Good.
>
> So, another option is to do nothing for stable branches.
>

Fair enough. The other point in favor of that option is that nobody
has reported this problem yet but my guess is that they would have
probably not used such a combination at least with pgoutput plugin
otherwise, they would have faced the ERRORs on the subscriber. So, we
can do this only for HEAD and decide on the fix if anyone ever reports
this problem.

> > > Would it work to enforce the restriction when such a table is added
> > > to a publication?
> >
> > But what if somebody defines REPLICA IDENTITY on the generated column
> > after adding the table to the publication?
>
> Well, maybe we can restrict the change of REPLICA IDENTITY if the table
> is already in a pgoutput publication?
>

What about the PRIMARY KEY case as shared in my later email? Even
apart from that the plugin is decided via slot, so we won't be able to
detect from table<->publication relationship.

> On 2024-Nov-12, Amit Kapila wrote:
>
> > Also, another point against restricting defining REPLICA IDENTITY on
> > generated columns is that we do allow generated columns to be PRIMARY
> > KEY which is a DEFAULT for REPLICA IDENTITY, so that also needs to be
> > restricted. That won't be a good idea.
>
> Oh, that's a good point too.
>
> It's not clear to me why doesn't pgoutput cope with generated columns in
> replica identities. Maybe that can be reconsidered?
>

In stable branches, we intentionally skip publishing generated columns
as we assumed that the subscriber side also had a generated column.
So, sending it would be a waste of network bandwidth. OTOH, when one
tries to replicate the changes to some other database that didn't have
the generated columns concept, it would create a problem. So we
developed a new feature for HEAD as part of commits 745217a051 and
7054186c4e which allows the publication of generated columns when
explicitly specified by the users.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2024-11-12 10:30:51 Re: Add html-serve target to autotools and meson
Previous Message Ashutosh Bapat 2024-11-12 10:26:39 Re: Separate memory contexts for relcache and catcache