Re: Pgoutput not capturing the generated columns

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-10-18 12:12:08
Message-ID: CAHv8RjLviXAWtB3Kcn1A1jPpqORpkNay1y2U+55K64sqwCdrGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 17, 2024 at 12:58 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Wed, 16 Oct 2024 at 23:25, Shubham Khanna
> <khannashubham1197(at)gmail(dot)com> wrote:
> >
> > On Wed, Oct 9, 2024 at 9:08 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > On Tue, 8 Oct 2024 at 11:37, Shubham Khanna <khannashubham1197(at)gmail(dot)com> wrote:
> > > >
> > > > On Fri, Oct 4, 2024 at 9:36 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > > >
> > > > > Hi Shubham, here are my review comments for v36-0001.
> > > > >
> > > > > ======
> > > > > 1. General - merge patches
> > > > >
> > > > > It is long past due when patches 0001 and 0002 should've been merged.
> > > > > AFAIK the split was only because historically these parts had
> > > > > different authors. But, keeping them separated is not helpful anymore.
> > > > >
> > > > > ======
> > > > > src/backend/catalog/pg_publication.c
> > > > >
> > > > > 2.
> > > > > Bitmapset *
> > > > > -pub_collist_validate(Relation targetrel, List *columns)
> > > > > +pub_collist_validate(Relation targetrel, List *columns, bool pubgencols)
> > > > >
> > > > > Since you removed the WARNING, this parameter 'pubgencols' is unused
> > > > > so it should also be removed.
> > > > >
> > > > > ======
> > > > > src/backend/replication/pgoutput/pgoutput.c
> > > > >
> > > > > 3.
> > > > > /*
> > > > > - * If the publication is FOR ALL TABLES then it is treated the same as
> > > > > - * if there are no column lists (even if other publications have a
> > > > > - * list).
> > > > > + * To handle cases where the publish_generated_columns option isn't
> > > > > + * specified for all tables in a publication, we must create a column
> > > > > + * list that excludes generated columns. So, the publisher will not
> > > > > + * replicate the generated columns.
> > > > > */
> > > > > - if (!pub->alltables)
> > > > > + if (!(pub->alltables && pub->pubgencols))
> > > > >
> > > > > I still found that comment hard to understand. Does this mean to say
> > > > > something like:
> > > > >
> > > > > ------
> > > > > Process potential column lists for the following cases:
> > > > >
> > > > > a. Any publication that is not FOR ALL TABLES.
> > > > >
> > > > > b. When the publication is FOR ALL TABLES and
> > > > > 'publish_generated_columns' is false.
> > > > > A FOR ALL TABLES publication doesn't have user-defined column lists,
> > > > > so all columns will be replicated by default. However, if
> > > > > 'publish_generated_columns' is set to false, column lists must still
> > > > > be created to exclude any generated columns from being published
> > > > > ------
> > > > >
> > > > > ======
> > > > > src/test/regress/sql/publication.sql
> > > > >
> > > > > 4.
> > > > > +SET client_min_messages = 'WARNING';
> > > > > +CREATE TABLE gencols (a int, gen1 int GENERATED ALWAYS AS (a * 2) STORED);
> > > > >
> > > > > AFAIK you don't need to keep changing 'client_min_messages',
> > > > > particularly now that you've removed the WARNING message that was
> > > > > previously emitted.
> > > > >
> > > > > ~
> > > > >
> > > > > 5.
> > > > > nit - minor comment changes.
> > > > >
> > > > > ======
> > > > > Please refer to the attachment which implements any nits from above.
> > > > >
> > > >
> > > > I have fixed all the given comments. Also, I have created a new 0003
> > > > patch for the TAP-Tests related to the '011_generated.pl' file. I am
> > > > planning to merge 0001 and 0003 patches once they will get fixed.
> > > > The attached patches contain the required changes.
> > >
> > > Few comments:
> > > 1) Since we are no longer throwing an error for generated columns, the
> > > function header comments also need to be updated accordingly " Checks
> > > for and raises an ERROR for any; unknown columns, system columns,
> > > duplicate columns or generated columns."
> > > - if (TupleDescAttr(tupdesc, attnum - 1)->attgenerated)
> > > - ereport(ERROR,
> > > -
> > > errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> > > - errmsg("cannot use generated
> > > column \"%s\" in publication column list",
> > > - colname));
> > > -
> > >
> > > 2) Tab completion missing for "PUBLISH_GENERATED_COLUMNS" option in
> > > ALTER PUBLICATION ... SET (
> > > postgres=# alter publication pub2 set (PUBLISH
> > > PUBLISH PUBLISH_VIA_PARTITION_ROOT
> > >
> > > 3) I was able to compile without this include, may be this is not required:
> > > --- a/src/backend/replication/logical/tablesync.c
> > > +++ b/src/backend/replication/logical/tablesync.c
> > > @@ -118,6 +118,7 @@
> > > #include "utils/builtins.h"
> > > #include "utils/lsyscache.h"
> > > #include "utils/memutils.h"
> > > +#include "utils/rel.h"
> > >
> > > 4) You can include "\dRp+ pubname" after each of the create/alter
> > > publication to verify the columns that will be published:
> > > +-- Test the 'publish_generated_columns' parameter enabled or disabled for
> > > +-- different scenarios with/without generated columns in column lists.
> > > +CREATE TABLE gencols (a int, gen1 int GENERATED ALWAYS AS (a * 2) STORED);
> > > +
> > > +-- Generated columns in column list, when 'publish_generated_columns'=false
> > > +CREATE PUBLICATION pub1 FOR table gencols(a, gen1) WITH
> > > (publish_generated_columns=false);
> > >
> > > +-- Generated columns in column list, when 'publish_generated_columns'=true
> > > +CREATE PUBLICATION pub2 FOR table gencols(a, gen1) WITH
> > > (publish_generated_columns=true);
> > > +
> > > +-- Generated columns in column list, then set
> > > 'publication_generate_columns'=false
> > > +ALTER PUBLICATION pub2 SET (publish_generated_columns = false);
> > > +
> > > +-- Remove generate columns from column list, when
> > > 'publish_generated_columns'=false
> > > +ALTER PUBLICATION pub2 SET TABLE gencols(a);
> > > +
> > > +-- Add generated columns in column list, when 'publish_generated_columns'=false
> > > +ALTER PUBLICATION pub2 SET TABLE gencols(a, gen1);
> > >
> >
> > I have fixed all the given comments. The attached patches contain the
> > required changes.
>
> Few comments:
> 1) This change is not required:
> diff --git a/src/backend/catalog/pg_subscription.c
> b/src/backend/catalog/pg_subscription.c
> index 9efc9159f2..fcfbf86c0b 100644
> --- a/src/backend/catalog/pg_subscription.c
> +++ b/src/backend/catalog/pg_subscription.c
> @@ -551,3 +551,34 @@ GetSubscriptionRelations(Oid subid, bool not_ready)
>
> return res;
> }
> +
> +/*
> + * Add publication names from the list to a string.
> + */
> +void
> +get_publications_str(List *publications, StringInfo dest, bool quote_literal)
> +{
> + ListCell *lc;
> + bool first = true;
> +
> + Assert(publications != NIL);
> +
> + foreach(lc, publications)
> + {
> + char *pubname = strVal(lfirst(lc));
> +
> + if (first)
> + first = false;
> + else
> + appendStringInfoString(dest, ", ");
> +
> + if (quote_literal)
> + appendStringInfoString(dest,
> quote_literal_cstr(pubname));
> + else
> + {
> + appendStringInfoChar(dest, '"');
> + appendStringInfoString(dest, pubname);
> + appendStringInfoChar(dest, '"');
> + }
> + }
> +}
>
> It can be moved to subscriptioncmds.c file as earlier.
>
> 2) This line change is not required:
> * Process and validate the 'columns' list and ensure the
> columns are all
> - * valid to use for a publication. Checks for and raises
> an ERROR for
> - * any; unknown columns, system columns, duplicate
> columns or generated
> - * columns.
> + * valid to use for a publication. Checks for and raises
> an ERROR for
>
> 3) Can we store this information in LogicalRepRelation instead of
> having a local variable as column information is being stored, that
> way remotegenlist and remotegenlist_res can be removed and code will
> be more simpler:
> + if (server_version >= 180000)
> + {
> + remotegenlist[natt] =
> DatumGetBool(slot_getattr(slot, 5, &isnull));
> +
> + /*
> + * If the column is generated and neither the
> generated column
> + * option is specified nor it appears in the
> column list, we will
> + * skip it.
> + */
> + if (remotegenlist[natt] &&
> !has_pub_with_pubgencols && !included_cols)
> + {
> + ExecClearTuple(slot);
> + continue;
> + }
> + }
> +
> rel_colname = TextDatumGetCString(slot_getattr(slot,
> 2, &isnull));
> Assert(!isnull);
>
> @@ -1015,7 +1112,7 @@ fetch_remote_table_info(char *nspname, char *relname,
> ExecDropSingleTupleTableSlot(slot);
>
> lrel->natts = natt;
> -
> + *remotegenlist_res = remotegenlist;

I have fixed all the given comments. The attached v40-0001 patch
contains the required changes.

Thanks and Regards,
Shubham Khanna.

Attachment Content-Type Size
v40-0003-Tap-tests-for-generated-columns.patch application/octet-stream 12.8 KB
v40-0002-DOCS-Generated-Column-Replication.patch application/octet-stream 13.9 KB
v40-0001-Enable-support-for-publish_generated_columns-opt.patch application/octet-stream 100.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Shubham Khanna 2024-10-18 12:15:22 Re: Pgoutput not capturing the generated columns
Previous Message Daniel Gustafsson 2024-10-18 12:09:43 Re: Incorrect comment on pg_shadow view