Re: Pgoutput not capturing the generated columns

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-10-22 01:02:58
Message-ID: CAHut+Pu5GW-AdLGjbEFnS3+PS=Xq9KeCn7ch9BZQdzGom2gq2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi SHubham, Here are my review comments for v40-0001 (code)

Please don't post a blanket response of "I have fixed all the
comments" response to this. Sometimes things get missed. Instead,
please reply done/not-done/whatever individually, so I can track the
changes properly.

======
src/backend/catalog/pg_publication.c

1.
- if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated)
- ereport(ERROR,
- errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
- errmsg("cannot use generated column \"%s\" in publication column list",
- colname));
-

This function no longer rejects generated columns from a publication
column list. So, now the function comment is wrong because it still
says "Checks for and raises an ERROR for any; unknown columns, system
columns, duplicate columns or generated columns."

NOTE: This was fixed already in v39-0001, but then broken again in
v40-0001 (???)

nit - also remove that semicolon (;) from the function comment.

======
src/backend/commands/publicationcmds.c

2.
- if (!isnull)
+ if (!isnull || !pubgencols)
{
int x;
Bitmapset *idattrs;
Bitmapset *columns = NULL;

- /* With REPLICA IDENTITY FULL, no column list is allowed. */
- if (relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
- result = true;
+ if (!isnull)
+ {
+ /* With REPLICA IDENTITY FULL, no column list is allowed. */
+ if (relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
+ result = true;
+
+ /* Transform the column list datum to a bitmapset. */
+ columns = pub_collist_to_bitmapset(NULL, datum, NULL);
+ }
+ else
+ {
+ TupleDesc desc = RelationGetDescr(relation);
+ int nliveatts = 0;
+
+ for (int i = 0; i < desc->natts; i++)
+ {
+ Form_pg_attribute att = TupleDescAttr(desc, i);
+
+ /* Skip if the attribute is dropped or generated */
+ if (att->attisdropped)
+ continue;
+
+ nliveatts++;
+
+ if (att->attgenerated)
+ continue;
+
+ columns = bms_add_member(columns, i + 1);
+ }

- /* Transform the column list datum to a bitmapset. */
- columns = pub_collist_to_bitmapset(NULL, datum, NULL);
+ /* Return if all columns of the table will be replicated */
+ if (bms_num_members(columns) == nliveatts)
+ {
+ bms_free(columns);
+ ReleaseSysCache(tuple);
+ return false;
+ }
+ }

2a.
AFAIK this code was written to deal with Sawada-San's comment about
UPDATE/DELETE [1], but the logic is not very clear. It needs more
comments on the "else" part, the generated columns and how the new
logic solves Sawada-San's problem.

~

2b.
This function is now dealing both with 'publish_via_root' as well as
'publish_generated_columns'. I am suspicious that you may be handling
each of these parameters independently (i.e.. there is some early
return "Return if all columns of the table will be replicated") but
there might be some scenarios where a *combination* of pubviaroot and
gencols is not working as expected.

For example, there is a whole comment later about this: "If pubviaroot
is true, we are validating the column list of the parent table, but
the bitmap contains the replica identity information of the child
table." which I am not sure is being dealt with correctly. IMO there
need to be more test cases added for these tricky combination
scenarios to prove the code is good.

~

2c.
I expected to see some more REPLICA IDENTITY tests to reproduce the
problem scenario that Sawada-San reported. Where are those?

======
src/backend/replication/logical/tablesync.c

3. make_copy_attnamelist

Patch v38-0001 used to have a COPY error:

+ /*
+ * Check if the subscription table generated column has same name
+ * as a non-generated column in the corresponding publication
+ * table.
+ */
+ if (!remotegenlist[remote_attnum])
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("logical replication target relation \"%s.%s\" has a
generated column \"%s\" "
+ "but corresponding column on source relation is not a generated column",
+ rel->remoterel.nspname, rel->remoterel.relname, NameStr(attr->attname))));
+

My most recent comment about this code was something along the lines
of "Make this detailed useful error message common if possible".

But, in v39 all this ERROR was removed (and it is still missing in
v40). Why? I did not see any other review comments asking for this to
be removed. AFAIK this was a correct error message for
not_generated->generated. Won't the current code now just silently
ignore/skip this, which is contrary behaviour to what normal
replication would do?

The recently deleted combo TAP tests could have detected this.

~~~

fetch_remote_table_info

4.
if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
There was a new 'server_version' variable added, so why not use it
here? In previous versions, we used to do so, but now since v39, the
code has reverted yet again. (???).

~~~

5.
appendStringInfo(&cmd,
"SELECT DISTINCT"
- " (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)"
- " THEN NULL ELSE gpt.attrs END)"
+ " (gpt.attrs)"
" FROM pg_publication p,"
" LATERAL pg_get_publication_tables(p.pubname) gpt,"
" pg_class c"

Is this change OK? This fragment is guarded only by server_version >=
150000 (not 180000) so I was unsure. Can you explain why you made this
change, and why it is correct even for versions older than 180000?

~~~

6.
lrel->natts = natt;
-
walrcv_clear_result(res);
nit - this whitespace change has nothing to do with this patch. Remove it.

~~~

copy_table

7.
+ * We also need to use this same COPY (SELECT ...) syntax when
+ * 'publish_generated_columns' is specified as true and the remote
+ * table has generated columns, because copy of generated columns is
+ * not supported by the normal COPY.

That mention about "when 'publish_generated_columns' is specified" is
not strictly true anymore, because the publication column list
overrides this parameter. It may be better to word this like below:

SUGGESTION
We also need to use this same COPY (SELECT ...) syntax when generated
columns are published, because copying generated columns is not
supported by the normal COPY.

======
src/backend/utils/cache/relcache.c

8.
/*
* Check if all columns are part of the REPLICA IDENTITY index or not.
*
- * If the publication is FOR ALL TABLES then it means the table has no
- * column list and we can skip the validation.
+ * If the publication is FOR ALL TABLES and publication includes
+ * generated columns then it means that all the table will replicate
+ * all columns and we can skip the validation.
*/

/table/tables/

Also, I think this can be re-worded to name the
'publish_generated_columns' parameter which might be more clear.

SUGGESTION:
If the publication is FOR ALL TABLES then column lists are not
possible. In this case, if 'publish_generated_columns' is true then
all table columns will be replicated, so the validation can be
skipped.

======
src/include/replication/logicalproto.h

9.
Bitmapset *attkeys; /* Bitmap of key columns */
+ bool *remotegenlist; /* Array to store whether each column is
+ * generated */
} LogicalRepRelation;

Since this is just a palloc'ed array same as the other fields like
'attnames' and 'atttyps', maybe it should be named and commented more
similarly to those? E.g more like:

bool *attremotegen; /* remote column is generated? */

======
src/test/regress/sql/publication.sql

10.
+-- Remove generate columns from column list, when
'publish_generated_columns'=false
+ALTER PUBLICATION pub2 SET TABLE gencols(a);
+\dRp+ pub2

typo /generate/generated/

======
src/test/subscription/t/100_bugs.pl

11.
- CREATE TABLE generated_cols (a int, b_gen int GENERATED ALWAYS AS (5
* a) STORED, c int);
+ CREATE TABLE generated_cols (a int, b_gen int, c int);

Hmm. But, should we be modifying tests in the "bugs" test file for a
new feature? If it is really necessary, then at least update the
comment to explain why.

~~~

12.
- CREATE TABLE generated_cols (a int, b_gen int GENERATED ALWAYS AS (5
* a) STORED, c int);
+ CREATE TABLE generated_cols (a int, b_gen int, c int);

Now you've ended up with a table called 'generated_cols' which has no
generated cols in it at all. Isn't this contrary to why this test case
existed in the first place? In other words, it feels a bit like a
hack. Perhaps you could have left all the tables as-is but just say
'publish_generated_columns=false'. But then that should be the default
parameter value anyhow, so TBH it is not clear to me why 100_bugs.pl
needed any changes at all. Can you give the reason?

======

PSA the nitpicks attachment which implements some of the cosmetic
suggestions from above.

======

[1] https://www.postgresql.org/message-id/CAD21AoB%3DDBVDNCGBja%2BsDa2-w9tsM7_E%3DZgyw2qYMR1R0FwDsg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia.

Attachment Content-Type Size
PS_NITPICKS_20241022_GENCOLS_V400001.txt text/plain 4.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-10-22 01:32:45 Re: EXPLAIN IndexOnlyScan shows disabled when enable_indexonlyscan=on
Previous Message Shinoda, Noriyoshi (SXD Japan FSIP) 2024-10-22 00:50:56 RE: Statistics Import and Export