RE: row filtering for logical replication

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: row filtering for logical replication
Date: 2022-01-21 05:21:58
Message-ID: OS0PR01MB571645EA968376063E63BB60945B9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Thu, Jan 20, 2022 10:26 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2022-Jan-20, Amit Kapila wrote:
>
> > It returns an invalid column referenced in an RF if any but if not
> > then it helps to form pubactions which is anyway required at a later
> > point in the caller. The idea is that when we are already traversing
> > publications we should store/gather as much info as possible.
>
> I think this design isn't quite awesome.
>
> > I think probably the API name is misleading, maybe we should name it
> > something like ValidateAndFetchPubInfo, ValidateAndRememberPubInfo, or
> > something along these lines?
>
> Maybe RelationBuildReplicationPublicationDesc or just
> RelationBuildPublicationDesc are good names for a routine that fill in
> the publication aspect of the relcache entry, as a parallel to
> RelationBuildPartitionDesc.
>
> > > Maybe this was meant to be "validate RF
> > > expressions" and return, perhaps, a bitmapset of all invalid columns
> > > referenced?
> >
> > Currently, we stop as soon as we find the first invalid column.
>
> That seems quite strange. (And above you say "gather as much info as
> possible", so why stop at the first one?)
>
> > > (What is an invalid column in the first place?)
> >
> > A column that is referenced in the row filter but is not part of
> > Replica Identity.
>
> I do wonder how do these invalid columns reach the table definition in
> the first place. Shouldn't these be detected at DDL time and prohibited
> from getting into the definition?

Personally, I'm a little hesitant to put the check at the DDL level, because
adding a check to DDLs like ATTACH PARTITION/CREATE TABLE ... PARTITION OF ([1]
explains why we would need to check these DDLs) looks a bit restrictive, and
users might complain about that. Putting the check in
CheckCmdReplicaIdentity seems more acceptable because it is consistent
with the existing behavior, which has drawn few complaints from users AFAIK.
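
To make the trade-off concrete, here is a minimal standalone sketch of the
execution-time approach. It is not the patch's code and not PostgreSQL's
actual CheckCmdReplicaIdentity: the types and functions (ColumnSet,
RelPubInfo, column_bit, invalid_filter_columns, command_allowed) are
hypothetical, and columns are modeled as bits in a 64-bit mask. The point is
only that the table definition may contain a row filter referencing
non-replica-identity columns, and the problem is reported when an UPDATE or
DELETE is attempted rather than at DDL time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model: one bit per column number (0..63). */
typedef uint64_t ColumnSet;

typedef enum { CMD_INSERT, CMD_UPDATE, CMD_DELETE } CmdKind;

/* Hypothetical summary of what a relation's publications require. */
typedef struct
{
    ColumnSet replica_identity_cols; /* columns covered by replica identity */
    ColumnSet row_filter_cols;       /* columns referenced by the row filter */
    bool      pub_updates;           /* some publication publishes UPDATEs */
    bool      pub_deletes;           /* some publication publishes DELETEs */
} RelPubInfo;

/* Build a one-column set for the given column number. */
ColumnSet
column_bit(unsigned attnum)
{
    assert(attnum < 64);
    return (ColumnSet) 1 << attnum;
}

/*
 * Return every "invalid" column: referenced by the row filter but not part
 * of the replica identity. Returning the whole set (instead of stopping at
 * the first hit) is an illustrative choice, not the patch's behavior.
 */
ColumnSet
invalid_filter_columns(const RelPubInfo *rel)
{
    return rel->row_filter_cols & ~rel->replica_identity_cols;
}

/*
 * Execution-time check: INSERT is always allowed; UPDATE and DELETE are
 * rejected only if they are published and the row filter uses a column
 * outside the replica identity.
 */
bool
command_allowed(const RelPubInfo *rel, CmdKind cmd)
{
    bool published = (cmd == CMD_UPDATE && rel->pub_updates) ||
                     (cmd == CMD_DELETE && rel->pub_deletes);

    if (!published)
        return true;
    return invalid_filter_columns(rel) == 0;
}
```

With this shape, a partition can be attached or a replica identity changed
without an immediate error; only a later published UPDATE or DELETE on the
relation would be refused, which mirrors how the existing replica identity
check behaves.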

[1] https://www.postgresql.org/message-id/CAA4eK1%2Bm45Xyzx7AUY9TyFnB6CZ7_%2B_uooPb7WHSpp7UE%3DYmKg%40mail.gmail.com

Best regards,
Hou zj
