Re: Pgoutput not capturing the generated columns

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-09-30 06:17:05
Message-ID: CAHut+Ps8wcnBHoDf0y8wQKoP=Uk7imQ7_X6HNfvr-KLqLe6C1g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are my review comments for patch v34-0001

======
doc/src/sgml/ddl.sgml

1.
- Generated columns are skipped for logical replication and cannot be
- specified in a <command>CREATE PUBLICATION</command> column list.
+ Generated columns may be skipped during logical replication
according to the
+ <command>CREATE PUBLICATION</command> parameter
+ <link linkend="sql-createpublication-params-with-publish-generated-columns">
+ <literal>publish_generated_columns</literal></link>.

This information is not quite correct because it makes no mention of
PUBLICATION column lists. OTOH I replaced all this paragraph in the
0003 patch anyhow, so maybe it is not worth worrying about this review
comment.

======
src/backend/catalog/pg_publication.c

2.
- if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated)
- ereport(ERROR,
+ if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated && !pubgencols)
+ ereport(WARNING,
errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
- errmsg("cannot use generated column \"%s\" in publication column list",
+ errmsg("specified generated column \"%s\" in publication column list
when publish_generated_columns as false",
colname));

Back when I proposed to have this WARNING I don't think there existed
the idea that the PUBLICATION column list would override the
'publish_generated_columns' parameter. But now that this is the
implementation, I am no longer 100% sure if a warning should be given
at all because this can be a perfectly valid combination. What do
others think?

======
src/backend/replication/pgoutput/pgoutput.c

3. prepare_all_columns_bms

3a.
nit - minor rewording function comment

~

3b.
I am not sure this is a good function name, particularly since it does
not return "all" columns. Can you name it more for what it does?

~~~

4. pgoutput_column_list_init

/*
- * If the publication is FOR ALL TABLES then it is treated the same as
- * if there are no column lists (even if other publications have a
- * list).
+ * To handle cases where the publish_generated_columns option isn't
+ * specified for all tables in a publication, the pubgencolumns check
+ * needs to be performed. In such cases, we must create a column list
+ * that excludes generated columns.
*/
- if (!pub->alltables)
+ if (!pub->alltables || !pub->pubgencols)

4a.
That comment still doesn't make much sense to me:

e.g.1. "To handle cases where the publish_generated_columns option
isn't specified for all tables in a publication". What is this trying
to say? A publication parameter is per-publication, so it is always
"for all tables in a publication".

e.g.2. " the pubgencolumns check" -- what is the pubgencols check?

Please rewrite the comment more clearly to explain the logic: What is
it doing? Why is it doing it?

~

4b.
+ if (!pub->alltables || !pub->pubgencols)

As mentioned in a previous review, there is too much negativity in
conditions like this. I think anything you can do to reverse all the
negativity will surely improve the readability of this function. See
[1 - #11a]

~

5.
- cols = pub_collist_to_bitmapset(cols, cfdatum,
- entry->entry_cxt);
+ if (!pub_rel_has_collist)
+ cols = pub_collist_to_bitmapset(cols, cfdatum, entry->entry_cxt);
+ else
+ cols = prepare_all_columns_bms(data, entry, desc);

Hm. Is that correct? The if/else is all backwards from what I would
have expected. IIUC the variable 'pub_rel_has_collist' is assigned to
the opposite value of what the name says, so then you are using it all
backwards to make it work.

nit - I have changed the code in the attachment how I think it should
be. Please check it makes sense.

~

6.
+ /* Get the number of live attributes. */
+ for (i = 0; i < desc->natts; i++)

nit - use a for-loop variable 'i'.

======
src/bin/pg_dump/pg_dump.c

7.
+ else if (fout->remoteVersion >= 130000)
+ appendPQExpBufferStr(query,
+ "SELECT p.tableoid, p.oid, p.pubname, "
+ "p.pubowner, "
+ "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
p.pubtruncate, p.pubviaroot, false AS pubviagencols "
"FROM pg_publication p");
else if (fout->remoteVersion >= 110000)
appendPQExpBufferStr(query,
"SELECT p.tableoid, p.oid, p.pubname, "
"p.pubowner, "
- "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
p.pubtruncate, false AS pubviaroot "
+ "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete,
p.pubtruncate, false AS pubviaroot, false AS pubviagencols "
"FROM pg_publication p");
else
appendPQExpBufferStr(query,
"SELECT p.tableoid, p.oid, p.pubname, "
"p.pubowner, "
- "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS
pubtruncate, false AS pubviaroot "
+ "p.puballtables, p.pubinsert, p.pubupdate, p.pubdelete, false AS
pubtruncate, false AS pubviaroot, false AS pubviagencols "
"FROM pg_publication p");

These changes are all wrong due to a typo. There is no such column as
'pubviagencols'. I made these changes in the nit attachment. Please
check it for correctness.

s/pubviagencols/pubgencols/

======
src/test/regress/sql/publication.sql

8.
--- error: generated column "d" can't be in list
+-- ok: generated columns can be in the list too

nit - name the generated column "d", to clarify this comment

~~~

9.
+-- Test the publication 'publish_generated_columns' parameter enabled
or disabled for different combinations.

nit - add another "======" separator comment before the new test
nit - some minor adjustments to whitespace and comments

~~~

10.
+-- No gencols in column list with 'publish_generated_columns'=false.
+CREATE PUBLICATION pub3 WITH (publish_generated_columns=false);
+-- ALTER PUBLICATION to add gencols to column list.
+ALTER PUBLICATION pub3 ADD TABLE gencols(a, gen1);

nit - by adding and removing collist for the existing table, we don't
need to have a 'pub3'.

======
src/test/subscription/t/031_column_list.pl

11.
It is not clear to me -- is there, or is there not yet any test case
for the multiple publication issues that were discussed previously?
e.g. when the same table has gencols but there are multiple
publications for the same subscription and the
'publish_generated_columns' parameter or column lists conflict.

======
[1] https://www.postgresql.org/message-id/CAHut%2BPt9ArtYc1Vtb1DqzEEFwjcoDHMyRCnUqkHQeFJuMCDkvQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment Content-Type Size
PS_NITPICKS_20240930_V340001.txt text/plain 9.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-09-30 06:24:04 Re: Normalize queries starting with SET for pg_stat_statements
Previous Message btsugieyuusuke 2024-09-30 06:02:03 pg_walsummary, Character-not-present-in-option