Re: Pgoutput not capturing the generated columns

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-10-03 11:00:14
Message-ID: CAHv8Rj+1RDd7AnJNzOJXk--zcbTtU3nys=ZgU3ktB4e3DWbJgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 30, 2024 at 11:47 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are my review comments for patch v34-0001
>
> ======
> doc/src/sgml/ddl.sgml
>
> 1.
> - Generated columns are skipped for logical replication and cannot be
> - specified in a <command>CREATE PUBLICATION</command> column list.
> + Generated columns may be skipped during logical replication
> according to the
> + <command>CREATE PUBLICATION</command> parameter
> + <link linkend="sql-createpublication-params-with-publish-generated-columns">
> + <literal>publish_generated_columns</literal></link>.
>
> This information is not quite correct because it makes no mention of
> PUBLICATION column lists. OTOH I replaced all this paragraph in the
> 0003 patch anyhow, so maybe it is not worth worrying about this review
> comment.
>
> ======
> src/backend/catalog/pg_publication.c
>
> 2.
> - if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated)
> - ereport(ERROR,
> + if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated && !pubgencols)
> + ereport(WARNING,
> errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> - errmsg("cannot use generated column \"%s\" in publication column list",
> + errmsg("specified generated column \"%s\" in publication column list
> when publish_generated_columns as false",
> colname));
>
> Back when I proposed to have this WARNING I don't think there existed
> the idea that the PUBLICATION column list would override the
> 'publish_generated_columns' parameter. But now that this is the
> implementation, I am no longer 100% sure if a warning should be given
> at all because this can be a perfectly valid combination. What do
> others think?
>
> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> 3. prepare_all_columns_bms
>
> 3a.
> nit - minor rewording function comment
>
> ~
>
> 3b.
> I am not sure this is a good function name, particularly since it does
> not return "all" columns. Can you name it more for what it does?
>
> ~~~
>
> 4. pgoutput_column_list_init
>
> /*
> - * If the publication is FOR ALL TABLES then it is treated the same as
> - * if there are no column lists (even if other publications have a
> - * list).
> + * To handle cases where the publish_generated_columns option isn't
> + * specified for all tables in a publication, the pubgencolumns check
> + * needs to be performed. In such cases, we must create a column list
> + * that excludes generated columns.
> */
> - if (!pub->alltables)
> + if (!pub->alltables || !pub->pubgencols)
>
> 4a.
> That comment still doesn't make much sense to me:
>
> e.g.1. "To handle cases where the publish_generated_columns option
> isn't specified for all tables in a publication". What is this trying
> to say? A publication parameter is per-publication, so it is always
> "for all tables in a publication".
>
> e.g.2. " the pubgencolumns check" -- what is the pubgencols check?
>
> Please rewrite the comment more clearly to explain the logic: What is
> it doing? Why is it doing it?
>
> ~
>
> 4b.
> + if (!pub->alltables || !pub->pubgencols)
>
> As mentioned in a previous review, there is too much negativity in
> conditions like this. I think anything you can do to reverse all the
> negativity will surely improve the readability of this function. See
> [1 - #11a]
>
>
> ~
>
> 5.
> - cols = pub_collist_to_bitmapset(cols, cfdatum,
> - entry->entry_cxt);
> + if (!pub_rel_has_collist)
> + cols = pub_collist_to_bitmapset(cols, cfdatum, entry->entry_cxt);
> + else
> + cols = prepare_all_columns_bms(data, entry, desc);
>
> Hm. Is that correct? The if/else is all backwards from what I would
> have expected. IIUC the variable 'pub_rel_has_collist' is assigned to
> the opposite value of what the name says, so then you are using it all
> backwards to make it work.
>
> nit - I have changed the code in the attachment how I think it should
> be. Please check it makes sense.
>
> ~
>
> 6.
> + /* Get the number of live attributes. */
> + for (i = 0; i < desc->natts; i++)
>
> nit - use a for-loop variable 'i'.
>
> ======
> src/bin/pg_dump/pg_dump.c
>
> 7.
> + else if (fout->remoteVersion >= 130000)
> + appendPQExpBufferStr(query,
> + "SELECT p.tableoid, p.oid, p.pubname, "
> + "p.pubowner, "
> + "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
> p.pubtruncate, p.pubviaroot, false AS pubviagencols "
> "FROM pg_publication p");
> else if (fout->remoteVersion >= 110000)
> appendPQExpBufferStr(query,
> "SELECT p.tableoid, p.oid, p.pubname, "
> "p.pubowner, "
> - "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
> p.pubtruncate, false AS pubviaroot "
> + "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
> p.pubtruncate, false AS pubviaroot, false AS pubviagencols "
> "FROM pg_publication p");
> else
> appendPQExpBufferStr(query,
> "SELECT p.tableoid, p.oid, p.pubname, "
> "p.pubowner, "
> - "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS
> pubtruncate, false AS pubviaroot "
> + "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS
> pubtruncate, false AS pubviaroot, false AS pubviagencols "
> "FROM pg_publication p");
>
> These changes are all wrong due to a typo. There is no such column as
> 'pubviagencols'. I made these changes in the nit attachment. Please
> check it for correctness.
>
> s/pubviagencols/pubgencols/
>
> ======
> src/test/regress/sql/publication.sql
>
> 8.
> --- error: generated column "d" can't be in list
> +-- ok: generated columns can be in the list too
>
> nit - name the generated column "d", to clarify this comment
>
> ~~~
>
> 9.
> +-- Test the publication 'publish_generated_columns' parameter enabled
> or disabled for different combinations.
>
> nit - add another "======" separator comment before the new test
> nit - some minor adjustments to whitespace and comments
>
> ~~~
>
> 10.
> +-- No gencols in column list with 'publish_generated_columns'=false.
> +CREATE PUBLICATION pub3 WITH (publish_generated_columns=false);
> +-- ALTER PUBLICATION to add gencols to column list.
> +ALTER PUBLICATION pub3 ADD TABLE gencols(a, gen1);
>
> nit - by adding and removing collist for the existing table, we don't
> need to have a 'pub3'.
>
> ======
> src/test/subscription/t/031_column_list.pl
>
> 11.
> It is not clear to me -- is there, or is there not yet any test case
> for the multiple publication issues that were discussed previously?
> e.g. when the same table has gencols but there are multiple
> publications for the same subscription and the
> 'publish_generated_columns' parameter or column lists conflict.
>
I have moved the new test cases to the 011_generated.pl file and
created a separate patch . I will post the TAP-TESTS once these
patches get fixed.
I have fixed all the given comments. The attached v36-0001 patch
contain the desired changes.

Thanks and Regards,
Shubham Khanna.

Attachment Content-Type Size
v36-0002-Support-replication-of-generated-column-during-i.patch application/octet-stream 17.5 KB
v36-0003-DOCS-Generated-Column-Replication.patch application/octet-stream 12.4 KB
v36-0001-Enable-support-for-publish_generated_columns-opt.patch application/octet-stream 82.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shubham Khanna 2024-10-03 11:04:56 Re: Pgoutput not capturing the generated columns
Previous Message Daniel Gustafsson 2024-10-03 10:17:03 Re: Retire support for OpenSSL 1.1.1 due to raised API requirements