Re: Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY
Date: 2024-11-14 06:52:44
Message-ID: CALDaNm0=aKhjxNb9L9L1bCCQD+38JYGGtOZLm4QHjxAmLUg8uQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, 13 Nov 2024 at 11:15, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
>
> Thanks for providing the comments.
>
> On Tue, 12 Nov 2024 at 12:52, Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Friday, November 8, 2024 7:06 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> > >
> > > Hi Amit,
> > >
> > > On Thu, 7 Nov 2024 at 11:37, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Nov 5, 2024 at 12:53 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
> > > wrote:
> > > > >
> > > > > To avoid the issue, we can disallow UPDATE/DELETE on table with
> > > > > unpublished generated column as REPLICA IDENTITY. I have attached a
> > > > > patch for the same.
> > > > >
> > > >
> > > > +CREATE PUBLICATION pub_gencol FOR TABLE testpub_gencol;
> > > > +UPDATE testpub_gencol SET a = 100 WHERE a = 1;
> > > > +ERROR: cannot update table "testpub_gencol"
> > > > +DETAIL: Column list used by the publication does not cover the replica identity.
> > > >
> > > > This is not a correct ERROR message as the publication doesn't have
> > > > any column list associated with it. You have added the code to detect
> > > > this in the column list code path which I think is not required. BTW,
> > > > you also need to consider the latest commit 7054186c4e for this. I
> > > > guess you need to keep another flag in PublicationDesc to detect this
> > > > and then give an appropriate ERROR.
> > >
> > > I have addressed the comments and provided an updated patch. Also, I am
> > > currently working to fix this issue in back branches.
> >
> > Thanks for the patch. I am reviewing it and have some initial comments:
> >
> >
> > 1.
> > + char attgenerated = get_attgenerated(relid, attnum);
> > +
> >
> > I think it's unnecessary to initialize attgenerated here because the value will
> > be overwritten if pubviaroot is true anyway. Also, the get_attgenerated()
> > is not cheap.
> >
> Fixed
>
> > 2.
> >
> > I think the patch missed to check the case when table is marked REPLICA
> > IDENTITY FULL, and generated column is not published:
> >
> > CREATE TABLE testpub_gencol (a INT, b INT GENERATED ALWAYS AS (a + 1) STORED NOT NULL);
> > ALTER TABLE testpub_gencol REPLICA IDENTITY FULL;
> > CREATE PUBLICATION pub_gencol FOR TABLE testpub_gencol;
> > UPDATE testpub_gencol SET a = 2;
> >
> > I expected the UPDATE to fail in above case, but it can still pass after applying the patch.
> >
> Fixed
>
> > 3.
> >
> > + * If the publication is FOR ALL TABLES we can skip the validation.
> > + */
> >
> > This comment seems not clear to me, could you elaborate a bit more on this ?
> >
> I missed to handle the case FOR ALL TABLES. Have removed the comment.
>
> > 4.
> >
> > Also, I think the patch does not handle the FOR ALL TABLE case correctly:
> >
> > CREATE TABLE testpub_gencol (a INT, b INT GENERATED ALWAYS AS (a + 1) STORED NOT NULL);
> > CREATE UNIQUE INDEX testpub_gencol_idx ON testpub_gencol (b);
> > ALTER TABLE testpub_gencol REPLICA IDENTITY USING index testpub_gencol_idx;
> > CREATE PUBLICATION pub_gencol FOR ALL TABLEs;
> > UPDATE testpub_gencol SET a = 2;
> >
> > I expected the UPDATE to fail in above case as well.
> >
> Fixed
>
> > 5.
> >
> > + else if (cmd == CMD_UPDATE && !pubdesc.replident_has_valid_gen_cols)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> > + errmsg("cannot update table \"%s\"",
> > + RelationGetRelationName(rel)),
> > + errdetail("REPLICA IDENTITY consists of an unpublished generated column.")));
> >
> > I think it would be better to use lower case "replica identity" to consistent
> > with other existing messages.
> >
> Fixed
>
> I have attached the updated patch here.

A few comments:
1) The first check below also tests relation->rd_rel->relispartition,
whereas the second one does not. Shouldn't the second check test it
too, so that we avoid a few function calls that are not required?
+    if (pubviaroot && relation->rd_rel->relispartition)
+    {
+        publish_as_relid = GetTopMostAncestorInPublication(pubid, ancestors, NULL);
+
+        if (!OidIsValid(publish_as_relid))
+            publish_as_relid = relid;
+    }
+

+    if (pubviaroot)
+    {
+        /* attribute name in the child table */
+        char       *colname = get_attname(relid, attnum, false);
+
+        /*
+         * Determine the attnum for the attribute name in parent (we
+         * are using the column list defined on the parent).
+         */
+        attnum = get_attnum(publish_as_relid, colname);
+        attgenerated = get_attgenerated(publish_as_relid, attnum);
+    }
+    else
+        attgenerated = get_attgenerated(relid, attnum);
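
To illustrate point 1, here is a minimal standalone sketch, not PostgreSQL code. The stub_* helpers, the call counter, and attgenerated_for() are hypothetical stand-ins for get_attname()/get_attnum()/get_attgenerated(), and it assumes publish_as_relid equals relid when the table is not a partition. It only shows that guarding the parent mapping with the same "pubviaroot && relispartition" condition skips the name-based lookups for non-partitioned tables.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;		/* stand-in typedef, not the PostgreSQL header */

/* Counts how many (stubbed) catalog lookups were performed. */
static int	lookup_calls = 0;

/* Stub: column 2 is treated as a stored generated column ('s'). */
static char
stub_get_attgenerated(Oid relid, int attnum)
{
	(void) relid;
	lookup_calls++;
	return (attnum == 2) ? 's' : '\0';
}

/* Stub: models get_attname() on the child plus get_attnum() on the parent. */
static int
stub_map_attnum_to_parent(Oid child, Oid parent, int attnum)
{
	(void) child;
	(void) parent;
	lookup_calls += 2;
	return attnum;				/* same column position in this toy model */
}

/*
 * Hypothetical helper: only map the column to the parent when the table is
 * actually a partition published via its root.
 */
static char
attgenerated_for(bool pubviaroot, bool relispartition,
				 Oid relid, Oid publish_as_relid, int attnum)
{
	assert(attnum > 0);

	if (pubviaroot && relispartition)
	{
		attnum = stub_map_attnum_to_parent(relid, publish_as_relid, attnum);
		return stub_get_attgenerated(publish_as_relid, attnum);
	}

	return stub_get_attgenerated(relid, attnum);
}
```

With the guard in place, a non-partitioned table in a publish_via_partition_root publication costs one lookup in this model instead of three, and the result is the same.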

2) I think we could use check_and_fetch_column_list to check whether
the publication has a column list, instead of the code below:
+    if (!puballtables)
+    {
+        tuple = SearchSysCache2(PUBLICATIONRELMAP,
+                                ObjectIdGetDatum(publish_as_relid),
+                                ObjectIdGetDatum(pubid));
+
+        if (!HeapTupleIsValid(tuple))
+            return false;
+
+        (void) SysCacheGetAttr(PUBLICATIONRELMAP, tuple,
+                               Anum_pg_publication_rel_prattrs,
+                               &isnull);
+
+        ReleaseSysCache(tuple);
+    }
+
+    if(puballtables || isnull)

3) Since there is only a single statement, remove the enclosing braces:
+        if (!pubform->pubgencols &&
+            (pubform->pubupdate || pubform->pubdelete) &&
+            replident_has_unpublished_gen_col(pubid, relation, ancestors,
+                                              pubform->pubviaroot, pubform->puballtables))
+        {
+            pubdesc->replident_has_valid_gen_cols = false;
+        }

4) pgindent should be run; there are a few issues:
4.a)
+extern bool replident_has_unpublished_gen_col(Oid pubid, Relation relation,
+
List *ancestors, bool pubviaroot, bool
puballtables);
4.b)
+ }
+
+ if(puballtables || isnull)
+ {
+ int x;
+ Bitmapset *idattrs = NULL;
4.c)
+ * generated column we should error out.
+ */
+ if(relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL &&
+ relation->rd_att->constr &&
relation->rd_att->constr->has_generated_stored)
+ result = true;
4.d)
+ while ((x = bms_next_member(idattrs, x)) >= 0)
+ {
+ AttrNumber attnum = (x +
FirstLowInvalidHeapAttributeNumber);
+ char attgenerated;

5) This could be a single-line comment:
+ /*
+ * Check if any REPLICA IDENTITY column is an generated column.
+ */
+ while ((x = bms_next_member(idattrs, x)) >= 0)

6) I feel that testing just one of UPDATE or DELETE is enough in this
case, as the code path is the same:
+UPDATE testpub_gencol SET a = 100 WHERE a = 1;
+DELETE FROM testpub_gencol WHERE a = 100;
+
+-- error - generated column "b" is not published and REPLICA IDENTITY is set FULL
+ALTER TABLE testpub_gencol REPLICA IDENTITY FULL;
+UPDATE testpub_gencol SET a = 100 WHERE a = 1;
+DELETE FROM testpub_gencol WHERE a = 100;
+DROP PUBLICATION pub_gencol;
+
+-- ok - generated column "b" is published and is part of REPLICA IDENTITY
+CREATE PUBLICATION pub_gencol FOR TABLE testpub_gencol with (publish_generated_columns = true);
+UPDATE testpub_gencol SET a = 100 WHERE a = 1;
+DELETE FROM testpub_gencol WHERE a = 100;

Regards,
Vignesh
