Re: Pgoutput not capturing the generated columns

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Cc: Shubham Khanna <khannashubham1197(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-06-04 09:30:49
Message-ID: CAHut+Pto4SJgDPMTwC7XfhH30=3yeUrFXNZpfKQMJSSgL0zVDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Here are some review comments for patch v5-0003.

======
0. Whitespace warnings when the patch was applied.

[postgres(at)CentOS7-x64 oss_postgres_misc]$ git apply
../patches_misc/v5-0003-Support-copy-of-generated-columns-during-tablesyn.patch
../patches_misc/v5-0003-Support-copy-of-generated-columns-during-tablesyn.patch:29:
trailing whitespace.
has no effect; the replicated data will be ignored and the subscriber
../patches_misc/v5-0003-Support-copy-of-generated-columns-during-tablesyn.patch:30:
trailing whitespace.
column will be filled as normal with the subscriber-side computed or
../patches_misc/v5-0003-Support-copy-of-generated-columns-during-tablesyn.patch:189:
trailing whitespace.
(walrcv_server_version(LogRepWorkerWalRcvConn) >= 120000 &&
warning: 3 lines add whitespace errors.

======
src/backend/commands/subscriptioncmds.c

1.
- res = walrcv_exec(wrconn, cmd.data, check_columnlist ? 3 : 2, tableRow);
+ column_count = (!include_generated_column && check_gen_col) ? 4 :
(check_columnlist ? 3 : 2);
+ res = walrcv_exec(wrconn, cmd.data, column_count, tableRow);

The 'column_count' seems out of control. Won't it be far simpler to
assign/increment the value dynamically only as required instead of the
tricky calculation at the end which is unnecessarily difficult to
understand?

~~~

2.
+ /*
+ * If include_generated_column option is false and all the column of
the table in the
+ * publication are generated then we should throw an error.
+ */
+ if (!isnull && !include_generated_column && check_gen_col)
+ {
+ attlist = DatumGetArrayTypeP(attlistdatum);
+ gen_col_count = DatumGetInt32(slot_getattr(slot, 4, &isnull));
+ Assert(!isnull);
+
+ attcount = ArrayGetNItems(ARR_NDIM(attlist), ARR_DIMS(attlist));
+
+ if (attcount != 0 && attcount == gen_col_count)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot use only generated column for table \"%s.%s\" in
publication when generated_column option is false",
+ nspname, relname));
+ }
+

Why do you think this new logic/error is necessary?

IIUC the 'include_generated_columns' should be false to match the
existing HEAD behavior. So this scenario where your publisher-side
table *only* has generated columns is something that could already
happen, right? IOW, this introduced error could be a candidate for
another discussion/thread/patch, but is it really required for this
current patch?

======
src/backend/replication/logical/tablesync.c

3.
lrel->remoteid,
- (walrcv_server_version(LogRepWorkerWalRcvConn) >= 120000 ?
- "AND a.attgenerated = ''" : ""),
+ (walrcv_server_version(LogRepWorkerWalRcvConn) >= 120000 &&
+ (walrcv_server_version(LogRepWorkerWalRcvConn) <= 160000 ||
+ !MySubscription->includegeneratedcolumn) ? "AND a.attgenerated = ''" : ""),

This ternary within one big appendStringInfo seems quite complicated.
Won't it be better to split the appendStringInfo into multiple parts
so the generated-cols calculation can be done more simply?

======
src/test/subscription/t/011_generated.pl

4.
I think there should be a variety of different tablesync scenarios
(when 'include_generated_columns' is true) tested here instead of just
one, and all varieties with lots of comments to say what they are
doing, expectations etc.

a. publisher-side gen-col "a" replicating to subscriber-side NOT
gen-col "a" (ok, value gets replicated)
b. publisher-side gen-col "a" replicating to subscriber-side gen-col
(ok, but ignored)
c. publisher-side NOT gen-col "b" replicating to subscriber-side
gen-col "b" (error?)

~~

5.
+$result = $node_subscriber->safe_psql('postgres', "SELECT a, b FROM tab3");
+is( $result, qq(1|2
+2|4
+3|6), 'generated columns initial sync with include_generated_column = true');

Should this say "ORDER BY..." so it will not fail if the row order
happens to be something unanticipated?

======

99.
Also, see the attached file with numerous other nitpicks:
- plural param- and var-names
- typos in comments
- missing spaces
- SQL keyword should be UPPERCASE
- etc.

Please apply any/all of these if you agree with them.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment Content-Type Size
PS_20240604_v50003_nitpicks.txt text/plain 3.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2024-06-04 10:00:00 Re: Recent 027_streaming_regress.pl hangs
Previous Message Dean Rasheed 2024-06-04 08:39:05 Re: Minor fixes for couple some comments around MERGE RETURNING