Re: Pgoutput not capturing the generated columns

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-08-22 11:58:21
Message-ID: CALDaNm31LZQfeR8Vv1qNCOREGffvZbgGDrTp=3h=EHiHTEO2pQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 22 Aug 2024 at 10:22, Shubham Khanna
<khannashubham1197(at)gmail(dot)com> wrote:
>
> On Fri, Aug 16, 2024 at 2:47 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Fri, 16 Aug 2024 at 10:04, Shubham Khanna
> > <khannashubham1197(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Aug 8, 2024 at 12:43 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > >
> > > > Hi Shubham,
> > > >
> > > > I think the v25-0001 patch only half-fixes the problems reported in my
> > > > v24-0001 review.
> > > >
> > > > ~
> > > >
> > > > Background (from the commit message):
> > > > This commit enables support for the 'include_generated_columns' option
> > > > in logical replication, allowing the transmission of generated column
> > > > information and data alongside regular table changes.
> > > >
> > > > ~
> > > >
> > > > The broken TAP test scenario in question is replicating from a
> > > > "not-generated" column to a "generated" column. As the generated
> > > > column is not on the publishing side, IMO the
> > > > 'include_generated_columns' option should have zero effect here.
> > > >
> > > > In other words, I expect this TAP test for 'include_generated_columns
> > > > = true' case should also be failing, as I wrote already yesterday:
> > > >
> > > > +# FIXME
> > > > +# Since there is no generated column on the publishing side this should give
> > > > +# the same result as the previous test. -- e.g. something like:
> > > > +# ERROR: logical replication target relation
> > > > "public.tab_nogen_to_gen" is missing
> > > > +# replicated column: "b"
> > >
> > > I have fixed the given comments. The attached v26-0001 Patch contains
> > > the required changes.
> >
> > Few comments:
> > 1) There's no need to pass include_generated_columns in this case; we
> > can retrieve it from ctx->data instead:
> > @@ -749,7 +764,7 @@ maybe_send_schema(LogicalDecodingContext *ctx,
> > static void
> > send_relation_and_attrs(Relation relation, TransactionId xid,
> > LogicalDecodingContext *ctx,
> > - Bitmapset *columns)
> > + Bitmapset *columns,
> > bool include_generated_columns)
> > {
> > TupleDesc desc = RelationGetDescr(relation);
> > int i;
> > @@ -766,7 +781,10 @@ send_relation_and_attrs(Relation relation,
> > TransactionId xid,
> >
> > 2) Commit message:
> > If the subscriber-side column is also a generated column then this option
> > has no effect; the replicated data will be ignored and the subscriber
> > column will be filled as normal with the subscriber-side computed or
> > default data.
> >
> > An error will occur in this case, so the message should be updated accordingly.
> >
> > 3) The current test is structured as follows: a) Create all required
> > tables b) Insert data into tables c) Create publications d) Create
> > subscriptions e) Perform inserts and verify
> > This approach can make reviewing and maintenance somewhat challenging.
> >
> > Instead, could you modify it to: a) Create the required table for a
> > single test b) Insert data for this test c) Create the publication for
> > this test d) Create the subscriptions for this test e) Perform inserts
> > and verify f) Clean up
> >
> > 4) We can maintain the test as a separate 0002 patch, as it may need a
> > few rounds of review and final adjustments. Once it's fully completed,
> > we can merge it back in.
> >
> > 5) Once we create and drop publication/subscriptions for individual
> > tests, we won't need such extensive configuration; we should be able
> > to run them with default values:
> > +$node_publisher->append_conf(
> > + 'postgresql.conf',
> > + "max_wal_senders = 20
> > + max_replication_slots = 20");
>
> Fixed all the given comments. The attached patches contain the
> suggested changes.

Few comments:
1) This is already been covered in the first existing test case, may
be this can be removed:
# =============================================================================
# Testcase start: Subscriber table with a generated column (b) on the
# subscriber, where column (b) is not present on the publisher.

This existing test:
$node_publisher->safe_psql(
'postgres', qq(
CREATE TABLE tab1 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) STORED);
INSERT INTO tab1 (a) VALUES (1), (2), (3);
CREATE PUBLICATION pub1 FOR ALL TABLES;
));

$node_subscriber->safe_psql(
'postgres', qq(
CREATE TABLE tab1 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a *
22) STORED, c int);
CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1;
));

2) Can we have this test verified with include_generated_columns =
true too like how others are done:
my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';

$node_publisher->safe_psql(
'postgres', qq(
CREATE TABLE tab1 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) STORED);
INSERT INTO tab1 (a) VALUES (1), (2), (3);
CREATE PUBLICATION pub1 FOR ALL TABLES;
));

$node_subscriber->safe_psql(
'postgres', qq(
CREATE TABLE tab1 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a *
22) STORED, c int);
CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1;
));

3) There is a typo in this comment:
3.a) # Testcase start: Publisher table with a generated column (b)
and subscriber
# table a with regular column (b).

It should be:
# Testcase start: Publisher table with a generated column (b) and subscriber
# table with a regular column (b).

3.b) similarly here too:
# Testcase end: Publisher table with a generated column (b) and subscriber
# table a with regular column (b).

3.c) The comments are not consistent, sometimes mentioned as
column(b) and sometimes as column (b). We can keep it consistent.

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrei Lepikhov 2024-08-22 12:02:42 Re: type cache cleanup improvements
Previous Message Peter Eisentraut 2024-08-22 11:32:43 Re: Redundant Result node