Re: Pgoutput not capturing the generated columns

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-10-29 02:14:17
Message-ID: CAHut+PsZDBVR9uy36wOF-uQvB+aYpkxQP-9_aPtkpGiOzbp7QQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are my review comments for v44-0001.

======
doc/src/sgml/ref/create_publication.sgml

1.
- When a column list is specified, only the named columns are replicated.
+ When a column list is specified, all columns (except generated columns)
+ of the table are replicated.
If no column list is specified, all columns of the table are replicated
through this publication, including any columns added later. It has no

Huh? This seems very wrong.

I think it should have been like:
When a column list is specified, only the named columns are
replicated. If no column list is specified, all table columns (except
generated columns) are replicated...

======
src/backend/replication/logical/proto.c

2.
+bool
+logicalrep_should_publish_column(Form_pg_attribute att, Bitmapset *columns)
+{
+ if (att->attisdropped)
+ return false;
+
+ /*
+ * Skip publishing generated columns if they are not included in the
+ * column list.
+ */
+ if (!columns && att->attgenerated)
+ return false;
+
+ /*
+ * Check if a column is covered by a column list.
+ */
+ if (columns && !bms_is_member(att->attnum, columns))
+ return false;
+
+ return true;
+}

I thought this could be more simply written as:

{
if (att->attisdropped)
return false;

/* If a column list was specified only publish the specified columns. */
if (columns)
return bms_is_member(att->attnum, columns);

/* If a column list was not specified publish everything except
generated columns. */
return !att->attgenerated;
}

======
src/backend/replication/pgoutput/pgoutput.c

3.
- if (att->attisdropped || att->attgenerated)
+ if (att->attisdropped)
+ continue;
+
+ if (att->attgenerated)
+ {
+ if (bms_is_member(att->attnum, cols))
+ gencolpresent = true;
+
continue;
+ }
+

nliveatts++;
}

/*
- * If column list includes all the columns of the table,
- * set it to NULL.
+ * If column list includes all the columns of the table
+ * and there are no generated columns, set it to NULL.
*/
- if (bms_num_members(cols) == nliveatts)
+ if (bms_num_members(cols) == nliveatts && !gencolpresent)
{
bms_free(cols);
cols = NULL;
~

That code still looks strange to me. I think that unconditional
'continue' for attgenerated is breaking the meaning of 'nliveattrs'
(which I take as meaning 'count-of-the-attrs-to-be-published').

AFAICT the code should be more like this:

if (att->attgenerated)
{
/* Generated cols are skipped unless they are present in a column list. */
if (!bms_is_member(att->attnum, cols))
continue;

gencolpresent = true;
}

======
src/test/regress/sql/publication.sql

4.
ALTER PUBLICATION testpub_fortable DROP TABLE testpub_tbl5;

+-- ok: generated column "d" can be in the list too
+ALTER PUBLICATION testpub_fortable ADD TABLE testpub_tbl5 (d);
+ALTER PUBLICATION testpub_fortable DROP TABLE testpub_tbl5;

Maybe you can change this test to do "SET TABLE testpub_tbl5 (a,d);"
instead of ADD TABLE, so then you can remove the earlier DROP and DROP
the table only once.

======
src/test/subscription/t/031_column_list.pl

5.
+# TEST: Dropped columns are not considered for the column list, and generated
+# columns are not replicated if they are not explicitly included in the column
+# list. So, the publication having a column list except for those columns and a
+# publication without any column (aka all columns as part of the columns list)
+# are considered to have the same column list.

Hmm. I don't think this wording is quite right "without any column".
AFAIK the original intent of this test was to prove only that
dropped/generated columns were ignored for the NULL column list logic.

That last sentence maybe should say more like:

So a publication with a column list specifying all table columns
(excluding only dropped and generated columns) is considered to be the
same as a publication that has no column list at all for that table.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrei Lepikhov 2024-10-29 02:47:27 Re: Consider the number of columns in the sort cost model
Previous Message Jeff Davis 2024-10-29 01:55:00 Re: Statistics Import and Export