Re: Pgoutput not capturing the generated columns

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-09-27 14:31:28
Message-ID: CAHv8RjJkUdYCdK_bL3yvEV=zKrA2dsnZYa1VMT2H5v0+qbaGbA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 23, 2024 at 5:56 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Fri, 20 Sept 2024 at 17:15, Shubham Khanna
> <khannashubham1197(at)gmail(dot)com> wrote:
> >
> > On Wed, Sep 11, 2024 at 8:55 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > >
> > > Here are a some more review comments for patch v30-0001.
> > >
> > > ======
> > > src/sgml/ref/create_publication.sgml
> > >
> > > 1.
> > > + <para>
> > > + If the publisher-side column is also a generated column
> > > then this option
> > > + has no effect; the publisher column will be filled as normal with the
> > > + publisher-side computed or default data.
> > > + </para>
> > >
> > > It should say "subscriber-side"; not "publisher-side". The same was
> > > already reported by Sawada-San [1].
> > >
> > > ~~~
> > >
> > > 2.
> > > + <para>
> > > + This parameter can only be set <literal>true</literal> if
> > > <literal>copy_data</literal> is
> > > + set to <literal>false</literal>.
> > > + </para>
> > >
> > > IMO this limitation should be addressed by patch 0001 like it was
> > > already done in the previous patches (e.g. v22-0002). I think
> > > Sawada-san suggested the same [1].
> > >
> > > Anyway, 'copy_data' is not a PUBLICATION option, so the fact it is
> > > mentioned like this without any reference to the SUBSCRIPTION seems
> > > like a cut/paste error from the previous implementation.
> > >
> > > ======
> > > src/backend/catalog/pg_publication.c
> > >
> > > 3. pub_collist_validate
> > > - if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated)
> > > - ereport(ERROR,
> > > - errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> > > - errmsg("cannot use generated column \"%s\" in publication column list",
> > > - colname));
> > > -
> > >
> > > Instead of just removing this ERROR entirely here, I thought it would
> > > be more user-friendly to give a WARNING if the PUBLICATION's explicit
> > > column list includes generated cols when the option
> > > "publish_generated_columns" is false. This combination doesn't seem
> > > like something a user would do intentionally, so just silently
> > > ignoring it (like the current patch does) is likely going to give
> > > someone unexpected results/grief.
> > >
> > > ======
> > > src/backend/replication/logical/proto.c
> > >
> > > 4. logicalrep_write_tuple, and logicalrep_write_attrs:
> > >
> > > - if (att->attisdropped || att->attgenerated)
> > > + if (att->attisdropped)
> > > continue;
> > >
> > > Why aren't you also checking the new PUBLICATION option here and
> > > skipping all gencols if the "publish_generated_columns" option is
> > > false? Or is the BMS of pgoutput_column_list_init handling this case?
> > > Maybe there should be an Assert for this?
> > >
> > > ======
> > > src/backend/replication/pgoutput/pgoutput.c
> > >
> > > 5. send_relation_and_attrs
> > >
> > > - if (att->attisdropped || att->attgenerated)
> > > + if (att->attisdropped)
> > > continue;
> > >
> > > Same question as #4.
> > >
> > > ~~~
> > >
> > > 6. prepare_all_columns_bms and pgoutput_column_list_init
> > >
> > > + if (att->attgenerated && !pub->pubgencolumns)
> > > + cols = bms_del_member(cols, i + 1);
> > >
> > > IIUC, the algorithm seems overly tricky filling the BMS with all
> > > columns, before straight away conditionally removing the generated
> > > columns. Can't it be refactored to assign all the correct columns
> > > up-front, to avoid calling bms_del_member()?
> > >
> > > ======
> > > src/bin/pg_dump/pg_dump.c
> > >
> > > 7. getPublications
> > >
> > > IIUC, there is lots of missing SQL code here (for all older versions)
> > > that should be saying "false AS pubgencolumns".
> > > e.g. compare the SQL with how "false AS pubviaroot" is used.
> > >
> > > ======
> > > src/bin/pg_dump/t/002_pg_dump.pl
> > >
> > > 8. Missing tests?
> > >
> > > I expected to see a pg_dump test for this new PUBLICATION option.
> > >
> > > ======
> > > src/test/regress/sql/publication.sql
> > >
> > > 9. Missing tests?
> > >
> > > How about adding another test case that checks this new option must be
> > > "Boolean"?
> > >
> > > ~~~
> > >
> > > 10. Missing tests?
> > >
> > > --- error: generated column "d" can't be in list
> > > +-- ok: generated columns can be in the list too
> > > ALTER PUBLICATION testpub_fortable ADD TABLE testpub_tbl5 (a, d);
> > > +ALTER PUBLICATION testpub_fortable DROP TABLE testpub_tbl5;
> > >
> > > (see my earlier comment #3)
> > >
> > > IMO there should be another test case for a WARNING here if the user
> > > attempts to include generated column 'd' in an explicit PUBLICATION
> > > column list while the "publish_generated-columns" is false.
> > >
> > > ======
> > > [1] https://www.postgresql.org/message-id/CAD21AoA-tdTz0G-vri8KM2TXeFU8RCDsOpBXUBCgwkfokF7%3DjA%40mail.gmail.com
> > >
> >
> > I have fixed all the comments. The attached patches contain the desired changes.
> > Also the merging of 0001 and 0002 can be done once there are no
> > comments on the patch to help in reviewing.
>
> The warning message appears to be incorrect. Even though
> publish_generated_columns is set to true, the warning indicates that
> it is false.
> CREATE TABLE t1 (a int, gen1 int GENERATED ALWAYS AS (a * 2) STORED);
> postgres=# CREATE PUBLICATION pub1 FOR table t1(gen1) WITH
> (publish_generated_columns=true);
> WARNING: specified generated column "gen1" in publication column list
> for publication with publish_generated_columns as false
>

I have fixed the warning message. The attached patches contain the
desired changes.

Thanks and Regards,
Shubham Khanna.

Attachment Content-Type Size
v34-0003-DOCS-Generated-Column-Replication.patch application/octet-stream 11.7 KB
v34-0001-Enable-support-for-publish_generated_columns-opt.patch application/octet-stream 83.0 KB
v34-0002-Support-replication-of-generated-column-during-i.patch application/octet-stream 18.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Shubham Khanna 2024-09-27 14:39:58 Re: Pgoutput not capturing the generated columns
Previous Message Roberto Mello 2024-09-27 14:26:38 Re: Changing the default random_page_cost value