From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: row filtering for logical replication |
Date: | 2021-12-17 00:47:28 |
Message-ID: | CAHut+PuU2vjZq5Te=m83wMfP03ubC0vg2LbHYWNhEJn3WHOcGw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Dec 14, 2021 at 4:20 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Dec 14, 2021 at 4:44 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > On Tue, Dec 7, 2021 at 5:48 PM tanghy(dot)fnst(at)fujitsu(dot)com
> > <tanghy(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > I think for "FOR ALL TABLE" publication(p2 in my case), table tbl should be
> > > treated as no filter, and table tbl should have no filter in subscription sub. Thoughts?
> > >
> > > But for now, the filter(a > 10) works both when copying initial data and later changes.
> > >
> > > To fix it, I think we can check if the table is published in a 'FOR ALL TABLES'
> > > publication or published as part of schema in function pgoutput_row_filter_init
> > > (which was introduced in v44-0003 patch), also we need to make some changes in
> > > tablesync.c.
> > >
> >
> > Partly fixed in v46-0005 [1]
> >
> > NOTE
> > - The initial COPY part of the tablesync does not take the publish
> > operation into account so it means that if any of the subscribed
> > publications have "puballtables" flag then all data will be copied
> > sans filters.
> >
>
> I think this should be okay but the way you have implemented it in the
> patch doesn't appear to be the optimal way. Can't we fetch
> allpubtables info and qual info as part of one query instead of using
> separate queries?
Fixed in v47 [1]. Now code uses a unified SQL query provided by Vignesh.
>
> > I guess this is consistent with the other decision to
> > ignore publication operations [2].
> >
> > TODO
> > - Documentation
> > - IIUC there is a similar case yet to be addressed - FOR ALL TABLES IN SCHEMA
> >
>
> Yeah, "FOR ALL TABLES IN SCHEMA" should also be addressed. In this
> case, the difference would be that we need to check the presence of
> schema corresponding to the table (for which we are fetching
> row_filter information) is there in pg_publication_namespace. If it
> exists then we don't need to apply row_filter for the table. I feel it
> is better to fetch all this information as part of the query which you
> are using to fetch row_filter info. The idea is to avoid the extra
> round-trip between subscriber and publisher.
>
Fixed in v47 [1]. Added code and TAP test case for ALL TABLES IN SCHEMA.
> Few other comments:
> ===================
> 1.
> @@ -926,6 +928,22 @@ pgoutput_row_filter_init(PGOutputData *data,
> Relation relation, RelationSyncEntr
> bool rfisnull;
>
> /*
> + * If the publication is FOR ALL TABLES then it is treated same as if this
> + * table has no filters (even if for some other publication it does).
> + */
> + if (pub->alltables)
> + {
> + if (pub->pubactions.pubinsert)
> + no_filter[idx_ins] = true;
> + if (pub->pubactions.pubupdate)
> + no_filter[idx_upd] = true;
> + if (pub->pubactions.pubdelete)
> + no_filter[idx_del] = true;
> +
> + continue;
> + }
>
> Is there a reason to continue checking the other publications if
> no_filter is true for all kind of pubactions?
>
Fixed in v47 [1].
> 2.
> + * All row filter expressions will be discarded if there is one
> + * publication-relation entry without a row filter. That's because
> + * all expressions are aggregated by the OR operator. The row
> + * filter absence means replicate all rows so a single valid
> + * expression means publish this row.
>
> This same comment is at two places, remove from one of the places. I
> think keeping it atop for loop is better.
>
Fixed in v47 [1]
> 3.
> + {
> + int idx;
> + bool found_filters = false;
>
> I am not sure if starting such ad-hoc braces in the code to localize
> the scope of variables is a regular practice. Can we please remove
> this?
>
Fixed in v47 [1]
Kind Regards,
Peter Smith.
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Smith | 2021-12-17 00:49:38 | Re: row filtering for logical replication |
Previous Message | Yugo NAGATA | 2021-12-17 00:47:18 | Allow DELETE to use ORDER BY and LIMIT/OFFSET |