Re: Pgoutput not capturing the generated columns

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-09-23 12:16:45
Message-ID: CALDaNm2jELQxHHCrfpp3FVdfrgJopNGoQQ_+uH=rj5iSDcNC2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 20 Sept 2024 at 17:15, Shubham Khanna
<khannashubham1197(at)gmail(dot)com> wrote:
>
> On Wed, Sep 11, 2024 at 8:55 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Here are a some more review comments for patch v30-0001.
> >
> > ======
> > src/sgml/ref/create_publication.sgml
> >
> > 1.
> > + <para>
> > + If the publisher-side column is also a generated column
> > then this option
> > + has no effect; the publisher column will be filled as normal with the
> > + publisher-side computed or default data.
> > + </para>
> >
> > It should say "subscriber-side"; not "publisher-side". The same was
> > already reported by Sawada-San [1].
> >
> > ~~~
> >
> > 2.
> > + <para>
> > + This parameter can only be set <literal>true</literal> if
> > <literal>copy_data</literal> is
> > + set to <literal>false</literal>.
> > + </para>
> >
> > IMO this limitation should be addressed by patch 0001 like it was
> > already done in the previous patches (e.g. v22-0002). I think
> > Sawada-san suggested the same [1].
> >
> > Anyway, 'copy_data' is not a PUBLICATION option, so the fact it is
> > mentioned like this without any reference to the SUBSCRIPTION seems
> > like a cut/paste error from the previous implementation.
> >
> > ======
> > src/backend/catalog/pg_publication.c
> >
> > 3. pub_collist_validate
> > - if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated)
> > - ereport(ERROR,
> > - errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> > - errmsg("cannot use generated column \"%s\" in publication column list",
> > - colname));
> > -
> >
> > Instead of just removing this ERROR entirely here, I thought it would
> > be more user-friendly to give a WARNING if the PUBLICATION's explicit
> > column list includes generated cols when the option
> > "publish_generated_columns" is false. This combination doesn't seem
> > like something a user would do intentionally, so just silently
> > ignoring it (like the current patch does) is likely going to give
> > someone unexpected results/grief.
> >
> > ======
> > src/backend/replication/logical/proto.c
> >
> > 4. logicalrep_write_tuple, and logicalrep_write_attrs:
> >
> > - if (att->attisdropped || att->attgenerated)
> > + if (att->attisdropped)
> > continue;
> >
> > Why aren't you also checking the new PUBLICATION option here and
> > skipping all gencols if the "publish_generated_columns" option is
> > false? Or is the BMS of pgoutput_column_list_init handling this case?
> > Maybe there should be an Assert for this?
> >
> > ======
> > src/backend/replication/pgoutput/pgoutput.c
> >
> > 5. send_relation_and_attrs
> >
> > - if (att->attisdropped || att->attgenerated)
> > + if (att->attisdropped)
> > continue;
> >
> > Same question as #4.
> >
> > ~~~
> >
> > 6. prepare_all_columns_bms and pgoutput_column_list_init
> >
> > + if (att->attgenerated && !pub->pubgencolumns)
> > + cols = bms_del_member(cols, i + 1);
> >
> > IIUC, the algorithm seems overly tricky filling the BMS with all
> > columns, before straight away conditionally removing the generated
> > columns. Can't it be refactored to assign all the correct columns
> > up-front, to avoid calling bms_del_member()?
> >
> > ======
> > src/bin/pg_dump/pg_dump.c
> >
> > 7. getPublications
> >
> > IIUC, there is lots of missing SQL code here (for all older versions)
> > that should be saying "false AS pubgencolumns".
> > e.g. compare the SQL with how "false AS pubviaroot" is used.
> >
> > ======
> > src/bin/pg_dump/t/002_pg_dump.pl
> >
> > 8. Missing tests?
> >
> > I expected to see a pg_dump test for this new PUBLICATION option.
> >
> > ======
> > src/test/regress/sql/publication.sql
> >
> > 9. Missing tests?
> >
> > How about adding another test case that checks this new option must be
> > "Boolean"?
> >
> > ~~~
> >
> > 10. Missing tests?
> >
> > --- error: generated column "d" can't be in list
> > +-- ok: generated columns can be in the list too
> > ALTER PUBLICATION testpub_fortable ADD TABLE testpub_tbl5 (a, d);
> > +ALTER PUBLICATION testpub_fortable DROP TABLE testpub_tbl5;
> >
> > (see my earlier comment #3)
> >
> > IMO there should be another test case for a WARNING here if the user
> > attempts to include generated column 'd' in an explicit PUBLICATION
> > column list while the "publish_generated-columns" is false.
> >
> > ======
> > [1] https://www.postgresql.org/message-id/CAD21AoA-tdTz0G-vri8KM2TXeFU8RCDsOpBXUBCgwkfokF7%3DjA%40mail.gmail.com
> >
>
> I have fixed all the comments. The attached patches contain the desired changes.
> Also the merging of 0001 and 0002 can be done once there are no
> comments on the patch to help in reviewing.

The warning message appears to be incorrect. Even though
publish_generated_columns is set to true, the warning indicates that
it is false.
CREATE TABLE t1 (a int, gen1 int GENERATED ALWAYS AS (a * 2) STORED);
postgres=# CREATE PUBLICATION pub1 FOR table t1(gen1) WITH
(publish_generated_columns=true);
WARNING: specified generated column "gen1" in publication column list
for publication with publish_generated_columns as false

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-09-23 12:35:45 Re: Increase of maintenance_work_mem limit in 64-bit Windows
Previous Message vignesh C 2024-09-23 12:10:58 Re: Pgoutput not capturing the generated columns