Re: Pgoutput not capturing the generated columns

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-09-11 03:24:33
Message-ID: CAHut+PuaitgE4tu3nfaR=PCQEKjB=mpDtZ1aWkbwb=JZE8YvqQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are a some more review comments for patch v30-0001.

======
src/sgml/ref/create_publication.sgml

1.
+ <para>
+ If the publisher-side column is also a generated column
then this option
+ has no effect; the publisher column will be filled as normal with the
+ publisher-side computed or default data.
+ </para>

It should say "subscriber-side"; not "publisher-side". The same was
already reported by Sawada-San [1].

~~~

2.
+ <para>
+ This parameter can only be set <literal>true</literal> if
<literal>copy_data</literal> is
+ set to <literal>false</literal>.
+ </para>

IMO this limitation should be addressed by patch 0001 like it was
already done in the previous patches (e.g. v22-0002). I think
Sawada-san suggested the same [1].

Anyway, 'copy_data' is not a PUBLICATION option, so the fact it is
mentioned like this without any reference to the SUBSCRIPTION seems
like a cut/paste error from the previous implementation.

======
src/backend/catalog/pg_publication.c

3. pub_collist_validate
- if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated)
- ereport(ERROR,
- errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
- errmsg("cannot use generated column \"%s\" in publication column list",
- colname));
-

Instead of just removing this ERROR entirely here, I thought it would
be more user-friendly to give a WARNING if the PUBLICATION's explicit
column list includes generated cols when the option
"publish_generated_columns" is false. This combination doesn't seem
like something a user would do intentionally, so just silently
ignoring it (like the current patch does) is likely going to give
someone unexpected results/grief.

======
src/backend/replication/logical/proto.c

4. logicalrep_write_tuple, and logicalrep_write_attrs:

- if (att->attisdropped || att->attgenerated)
+ if (att->attisdropped)
continue;

Why aren't you also checking the new PUBLICATION option here and
skipping all gencols if the "publish_generated_columns" option is
false? Or is the BMS of pgoutput_column_list_init handling this case?
Maybe there should be an Assert for this?

======
src/backend/replication/pgoutput/pgoutput.c

5. send_relation_and_attrs

- if (att->attisdropped || att->attgenerated)
+ if (att->attisdropped)
continue;

Same question as #4.

~~~

6. prepare_all_columns_bms and pgoutput_column_list_init

+ if (att->attgenerated && !pub->pubgencolumns)
+ cols = bms_del_member(cols, i + 1);

IIUC, the algorithm seems overly tricky filling the BMS with all
columns, before straight away conditionally removing the generated
columns. Can't it be refactored to assign all the correct columns
up-front, to avoid calling bms_del_member()?

======
src/bin/pg_dump/pg_dump.c

7. getPublications

IIUC, there is lots of missing SQL code here (for all older versions)
that should be saying "false AS pubgencolumns".
e.g. compare the SQL with how "false AS pubviaroot" is used.

======
src/bin/pg_dump/t/002_pg_dump.pl

8. Missing tests?

I expected to see a pg_dump test for this new PUBLICATION option.

======
src/test/regress/sql/publication.sql

9. Missing tests?

How about adding another test case that checks this new option must be
"Boolean"?

~~~

10. Missing tests?

--- error: generated column "d" can't be in list
+-- ok: generated columns can be in the list too
ALTER PUBLICATION testpub_fortable ADD TABLE testpub_tbl5 (a, d);
+ALTER PUBLICATION testpub_fortable DROP TABLE testpub_tbl5;

(see my earlier comment #3)

IMO there should be another test case for a WARNING here if the user
attempts to include generated column 'd' in an explicit PUBLICATION
column list while the "publish_generated-columns" is false.

======
[1] https://www.postgresql.org/message-id/CAD21AoA-tdTz0G-vri8KM2TXeFU8RCDsOpBXUBCgwkfokF7%3DjA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2024-09-11 03:43:47 Allow pushdown of HAVING clauses with grouping sets
Previous Message shveta malik 2024-09-11 03:11:35 Re: Disallow altering invalidated replication slots