Re: Pgoutput not capturing the generated columns

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Shubham Khanna <khannashubham1197(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-10-29 09:35:57
Message-ID: CALDaNm0wSqLo25aWgrfRLO1p8dw=40i7Ysj0ayeuLjLvWP3oOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 29 Oct 2024 at 07:44, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are my review comments for v44-0001.
>
> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 1.
> - When a column list is specified, only the named columns are replicated.
> + When a column list is specified, all columns (except generated columns)
> + of the table are replicated.
> If no column list is specified, all columns of the table are replicated
> through this publication, including any columns added later. It has no
>
> Huh? This seems very wrong.
>
> I think it should have been like:
> When a column list is specified, only the named columns are
> replicated. If no column list is specified, all table columns (except
> generated columns) are replicated...

Modified

> ======
> src/backend/replication/logical/proto.c
>
> 2.
> +bool
> +logicalrep_should_publish_column(Form_pg_attribute att, Bitmapset *columns)
> +{
> + if (att->attisdropped)
> + return false;
> +
> + /*
> + * Skip publishing generated columns if they are not included in the
> + * column list.
> + */
> + if (!columns && att->attgenerated)
> + return false;
> +
> + /*
> + * Check if a column is covered by a column list.
> + */
> + if (columns && !bms_is_member(att->attnum, columns))
> + return false;
> +
> + return true;
> +}
>
> I thought this could be more simply written as:
>
> {
> if (att->attisdropped)
> return false;
>
> /* If a column list was specified only publish the specified columns. */
> if (columns)
> return bms_is_member(att->attnum, columns);
>
> /* If a column list was not specified publish everything except
> generated columns. */
> return !att->attgenerated;
> }

I preferred the earlier code as it is more simple, added a few
comments for the same to avoid confusion.

> ======
> src/backend/replication/pgoutput/pgoutput.c
>
> 3.
> - if (att->attisdropped || att->attgenerated)
> + if (att->attisdropped)
> + continue;
> +
> + if (att->attgenerated)
> + {
> + if (bms_is_member(att->attnum, cols))
> + gencolpresent = true;
> +
> continue;
> + }
> +
>
> nliveatts++;
> }
>
> /*
> - * If column list includes all the columns of the table,
> - * set it to NULL.
> + * If column list includes all the columns of the table
> + * and there are no generated columns, set it to NULL.
> */
> - if (bms_num_members(cols) == nliveatts)
> + if (bms_num_members(cols) == nliveatts && !gencolpresent)
> {
> bms_free(cols);
> cols = NULL;
> ~
>
> That code still looks strange to me. I think that unconditional
> 'continue' for attgenerated is breaking the meaning of 'nliveattrs'
> (which I take as meaning 'count-of-the-attrs-to-be-published').
>
> AFAICT the code should be more like this:
>
> if (att->attgenerated)
> {
> /* Generated cols are skipped unless they are present in a column list. */
> if (!bms_is_member(att->attnum, cols))
> continue;
>
> gencolpresent = true;
> }

Modified

> ======
> src/test/regress/sql/publication.sql
>
> 4.
> ALTER PUBLICATION testpub_fortable DROP TABLE testpub_tbl5;
>
> +-- ok: generated column "d" can be in the list too
> +ALTER PUBLICATION testpub_fortable ADD TABLE testpub_tbl5 (d);
> +ALTER PUBLICATION testpub_fortable DROP TABLE testpub_tbl5;
>
> Maybe you can change this test to do "SET TABLE testpub_tbl5 (a,d);"
> instead of ADD TABLE, so then you can remove the earlier DROP and DROP
> the table only once.

I did not make this change as Amit also felt that way, added column a
also mentioned in [1].

> ======
> src/test/subscription/t/031_column_list.pl
>
> 5.
> +# TEST: Dropped columns are not considered for the column list, and generated
> +# columns are not replicated if they are not explicitly included in the column
> +# list. So, the publication having a column list except for those columns and a
> +# publication without any column (aka all columns as part of the columns list)
> +# are considered to have the same column list.
>
> Hmm. I don't think this wording is quite right "without any column".
> AFAIK the original intent of this test was to prove only that
> dropped/generated columns were ignored for the NULL column list logic.
>
> That last sentence maybe should say more like:
>
> So a publication with a column list specifying all table columns
> (excluding only dropped and generated columns) is considered to be the
> same as a publication that has no column list at all for that table.

I have just changed "publication without any column" to "publication
without any column list" as the rest looks ok to me.

The attached v45 version patch has the changes for the same. I have
also merged the 0002 patch as the patch looks fairly stable now.

[1] - https://www.postgresql.org/message-id/CAA4eK1K31%3D1draCJE0ng3Drt8C9D65qPppwK%3D-V64YMiDyRziA%40mail.gmail.com

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2024-10-29 09:39:03 Re: define pg_structiszero(addr, s, r)
Previous Message Peter Eisentraut 2024-10-29 09:34:09 Re: Make all Perl warnings fatal