Re: Pgoutput not capturing the generated columns

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-10-09 06:22:15
Message-ID: CALDaNm30nY54RNZOHV_wxyvt0NHgW58ckNr47C=eqfXRV4YWQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 8 Oct 2024 at 11:37, Shubham Khanna <khannashubham1197(at)gmail(dot)com> wrote:
>
> On Fri, Oct 4, 2024 at 9:36 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Shubham, here are my review comments for v36-0001.
> >
> > ======
> > 1. General - merge patches
> >
> > It is long past due when patches 0001 and 0002 should've been merged.
> > AFAIK the split was only because historically these parts had
> > different authors. But, keeping them separated is not helpful anymore.
> >
> > ======
> > src/backend/catalog/pg_publication.c
> >
> > 2.
> > Bitmapset *
> > -pub_collist_validate(Relation targetrel, List *columns)
> > +pub_collist_validate(Relation targetrel, List *columns, bool pubgencols)
> >
> > Since you removed the WARNING, this parameter 'pubgencols' is unused
> > so it should also be removed.
> >
> > ======
> > src/backend/replication/pgoutput/pgoutput.c
> >
> > 3.
> > /*
> > - * If the publication is FOR ALL TABLES then it is treated the same as
> > - * if there are no column lists (even if other publications have a
> > - * list).
> > + * To handle cases where the publish_generated_columns option isn't
> > + * specified for all tables in a publication, we must create a column
> > + * list that excludes generated columns. So, the publisher will not
> > + * replicate the generated columns.
> > */
> > - if (!pub->alltables)
> > + if (!(pub->alltables && pub->pubgencols))
> >
> > I still found that comment hard to understand. Does this mean to say
> > something like:
> >
> > ------
> > Process potential column lists for the following cases:
> >
> > a. Any publication that is not FOR ALL TABLES.
> >
> > b. When the publication is FOR ALL TABLES and
> > 'publish_generated_columns' is false.
> > A FOR ALL TABLES publication doesn't have user-defined column lists,
> > so all columns will be replicated by default. However, if
> > 'publish_generated_columns' is set to false, column lists must still
> > be created to exclude any generated columns from being published
> > ------
> >
> > ======
> > src/test/regress/sql/publication.sql
> >
> > 4.
> > +SET client_min_messages = 'WARNING';
> > +CREATE TABLE gencols (a int, gen1 int GENERATED ALWAYS AS (a * 2) STORED);
> >
> > AFAIK you don't need to keep changing 'client_min_messages',
> > particularly now that you've removed the WARNING message that was
> > previously emitted.
> >
> > ~
> >
> > 5.
> > nit - minor comment changes.
> >
> > ======
> > Please refer to the attachment which implements any nits from above.
> >
>
> I have fixed all the given comments. Also, I have created a new 0003
> patch for the TAP-Tests related to the '011_generated.pl' file. I am
> planning to merge 0001 and 0003 patches once they will get fixed.
> The attached patches contain the required changes.

Few comments:
1) I felt this change need not be part of this patch, if required it
can be proposed as a separate patch:
+ if (server_version >= 150000)
{
WalRcvExecResult *pubres;
TupleTableSlot *tslot;
Oid attrsRow[] = {INT2VECTOROID};
- StringInfoData pub_names;
-
- initStringInfo(&pub_names);
- foreach(lc, MySubscription->publications)
- {
- if (foreach_current_index(lc) > 0)
- appendStringInfoString(&pub_names, ", ");
- appendStringInfoString(&pub_names,
quote_literal_cstr(strVal(lfirst(lc))));
- }
+ StringInfo pub_names = makeStringInfo();

2) These two statements can be combined in to single appendStringInfo:
+ appendStringInfo(&cmd,
" FROM pg_catalog.pg_attribute a"
" LEFT JOIN pg_catalog.pg_index i"
" ON (i.indexrelid =
pg_get_replica_identity_index(%u))"
" WHERE a.attnum > 0::pg_catalog.int2"
- " AND NOT a.attisdropped %s"
+ " AND NOT a.attisdropped",
lrel->remoteid);
+
+ appendStringInfo(&cmd,
" AND a.attrelid = %u"
" ORDER BY a.attnum",
- lrel->remoteid,
-
(walrcv_server_version(LogRepWorkerWalRcvConn) >= 120000 ?
- "AND a.attgenerated = ''" : ""),
lrel->remoteid);

3) In which scenario this will be hit:
+ /*
+ * Construct column list for COPY, excluding columns that are
subscription
+ * table generated columns.
+ */
+ for (int i = 0; i < rel->remoterel.natts; i++)
+ {
+ if (!localgenlist[i])
+ attnamelist = lappend(attnamelist,
+
makeString(rel->remoterel.attnames[i]));
+ }

As in case of publisher having non generated columns:
CREATE TABLE t1(c1 int, c2 int)
and subscriber having generated columns:
CREATE TABLE t1(c1 int, c2 int GENERATED ALWAYS AS (c1 * 2) STORED)

We throw an error much earlier at
logicalrep_rel_open->logicalrep_report_missing_attrs saying:
ERROR: logical replication target relation "public.t1" is missing
replicated column: "c2"

4) To simplify the code and reduce complexity, we can refactor the
error checks to be included within the fetch_remote_table_info
function. This way, the remotegenlist will not need to be prepared and
passed to make_copy_attnamelist:
+ /*
+ * This loop checks for generated columns of the subscription table.
+ */
+ for (int i = 0; i < desc->natts; i++)
{
- attnamelist = lappend(attnamelist,
-
makeString(rel->remoterel.attnames[i]));
+ int remote_attnum;
+ Form_pg_attribute attr = TupleDescAttr(desc, i);
+
+ if (!attr->attgenerated)
+ continue;
+
+ remote_attnum = logicalrep_rel_att_by_name(&rel->remoterel,
+
NameStr(attr->attname));
+
+ if (remote_attnum >= 0)
+ {
+ /*
+ * Check if the subscription table generated
column has same name
+ * as a non-generated column in the
corresponding publication
+ * table.
+ */
+ if (!remotegenlist[remote_attnum])
+ ereport(ERROR,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("logical
replication target relation \"%s.%s\" has a generated column \"%s\" "
+ "but
corresponding column on source relation is not a generated column",
+
rel->remoterel.nspname, rel->remoterel.relname,
NameStr(attr->attname))));
+
+ /*
+ * 'localgenlist' records that this is a
generated column in the
+ * subscription table. Later, we use this
information to skip
+ * adding this column to the column list for COPY.
+ */
+ localgenlist[remote_attnum] = true;
+ }
}

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2024-10-09 06:33:47 Re: Conflict Detection and Resolution
Previous Message Yugo Nagata 2024-10-09 06:15:08 Re: Set AUTOCOMMIT to on in script output by pg_dump