Re: row filtering for logical replication

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-16 04:40:56
Message-ID: CAFiTN-tLvpxkGMV9Vutv1bwV7-DpgRJaXL4b8h9Tjj9Aho4QXg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 16, 2021 at 8:57 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
> > <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> > >
> > > Is there some reasonable rule which of the old/new tuples (or both) to
> > > use for the WHERE condition? Or maybe it'd be handy to allow referencing
> > > OLD/NEW as in triggers?
> >
> > I think for inserts we only replicate rows that match the filter
> > condition, so when we update a row we should maintain the same
> > invariant, right? That means the filter should be applied at least to
> > the NEW row, IMHO. That said, if a row was inserted that satisfied the
> > filter and was replicated, and we then update it to a value that does
> > not satisfy the filter, the update will not be replicated. I think that
> > makes sense: if an insert does not send a row that fails the filter to
> > the replica, why should an update do so?
> >
>
> There is another theory in this regard which is what if the old row
> (created by the previous insert) is not sent to the subscriber as that
> didn't match the filter but after the update, we decide to send it
> because the updated row (new row) matches the filter condition. In
> this case, I think it will generate an update conflict on the
> subscriber as the old row won't be present. As of now, we just skip
> the update but in the future, we might have some conflict handling
> there. If this is true then even if the new row matches the filter,
> there is no guarantee that it will be applied on the subscriber-side
> unless the old row also matches the filter.

Yeah, it's a valid point.
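
A minimal sketch of the scenario above, as a toy Python model rather than
anything from the patch. It assumes a hypothetical filter "a > 10", a
publisher that checks only the new row of an update, and a subscriber that
skips an update whose old row is missing. All names are illustrative.

```python
# Toy model: a row is a dict, a change is (operation, old_row, new_row).

def row_filter(row):
    """Hypothetical publication filter, standing in for WHERE (a > 10)."""
    return row["a"] > 10

def publish(changes, matches):
    """Publisher side: evaluate the filter on the NEW row only."""
    return [(op, old, new) for op, old, new in changes
            if op in ("insert", "update") and matches(new)]

def apply_on_subscriber(stream):
    """Subscriber side: an update for a row that is not present is skipped."""
    table, skipped = {}, []
    for op, old, new in stream:
        if op == "insert":
            table[new["id"]] = new
        elif op == "update":
            if old["id"] in table:
                table[old["id"]] = new
            else:
                skipped.append(new)  # old row was never replicated
    return table, skipped

changes = [
    ("insert", None, {"id": 1, "a": 5}),                # fails filter, not sent
    ("update", {"id": 1, "a": 5}, {"id": 1, "a": 20}),  # new row passes, sent
]
table, skipped = apply_on_subscriber(publish(changes, row_filter))
```

In this model the update is sent but cannot be applied, so the subscriber
table stays empty and the new row ends up in the skipped list, even though
the new row matches the filter.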

> Sure, there could be a
> case where the user might have changed the filter between insert and
> update but maybe we can have a separate way to deal with such cases if
> required like providing some provision where the user can specify
> whether it would like to match old/new row in updates?

Yeah, I think the best approach is to give users an option to apply the
filter to the old row, the new row, or both. In fact, they should be
able to apply different filters to the old and new rows. I have one more
thought: currently we provide a filter per publication table; wouldn't
it make sense to provide filters per operation on the publication table?
I mean different filters for insert, delete, the old row of an update,
and the new row of an update.
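
The per-operation idea could be sketched roughly as follows. This is a toy
Python model, not proposed syntax or patch code: OperationFilters,
should_publish, and the rule that an update must pass both its old-row and
new-row filters are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Row = dict
Predicate = Callable[[Row], bool]

def always(row: Row) -> bool:
    """Default filter: publish everything."""
    return True

@dataclass
class OperationFilters:
    """Hypothetical per-operation filters for one publication table."""
    insert: Predicate = always
    delete: Predicate = always
    update_old: Predicate = always
    update_new: Predicate = always

def should_publish(f: OperationFilters, op: str,
                   old: Optional[Row], new: Optional[Row]) -> bool:
    """Decide whether a change is sent, using the filter for its operation."""
    if op == "insert":
        return f.insert(new)
    if op == "delete":
        return f.delete(old)
    if op == "update":
        # Assumed policy: both the old and the new row must pass.
        return f.update_old(old) and f.update_new(new)
    raise ValueError(f"unknown operation: {op}")

def big(row: Row) -> bool:
    return row["a"] > 10

filters = OperationFilters(insert=big, update_old=big, update_new=big)

sent_insert = should_publish(filters, "insert", None, {"id": 2, "a": 20})
sent_update = should_publish(filters, "update",
                             {"id": 1, "a": 5}, {"id": 1, "a": 20})
sent_delete = should_publish(filters, "delete", {"id": 1, "a": 5}, None)
```

With these filters the insert is published, the update is not (its old row
fails update_old, so the subscriber never sees an update for a row it does
not have), and the delete is published because no delete filter was set.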

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
