Re: row filtering for logical replication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-12-22 07:43:39
Message-ID: CAHut+Pv83nOf7YAjTBVDDZUwisZJd0Vubg1svS92zD+6py3D-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 21, 2021 at 9:03 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
...
> Few comments:
> 1) list_free(schemarelids) is called inside if and and outside if in
> break case. Can we move it above continue so that it does not gets
> called in the break case:
> + schemarelids =
> GetAllSchemaPublicationRelations(pub->oid,
> +
> pub->pubviaroot ?
> +
> PUBLICATION_PART_ROOT
> :
> +
>
> PUBLICATION_PART_LEAF);
> + if (list_member_oid(schemarelids, entry->relid))
> + {
> + if (pub->pubactions.pubinsert)
> + no_filter[idx_ins] = true;
> + if (pub->pubactions.pubupdate)
> + no_filter[idx_upd] = true;
> + if (pub->pubactions.pubdelete)
> + no_filter[idx_del] = true;
> +
> + list_free(schemarelids);
> +
> + /* Quick exit loop if all pubactions
> have no row-filter. */
> + if (no_filter[idx_ins] &&
> no_filter[idx_upd] && no_filter[idx_del])
> + break;
> +
> + continue;
> + }
> + list_free(schemarelids);
>

I think this review comment is mistaken. The break will break from the
loop, so the free is still needed. So I skipped this comment. If you
still think there is a problem please give a more detailed explanation
about it.

> 2) create_edata_for_relation also is using similar logic, can it also
> call this function to reduce duplicate code
> +static EState *
> +create_estate_for_relation(Relation rel)
> +{
> + EState *estate;
> + RangeTblEntry *rte;
> +
> + estate = CreateExecutorState();
> +
> + rte = makeNode(RangeTblEntry);
> + rte->rtekind = RTE_RELATION;
> + rte->relid = RelationGetRelid(rel);
> + rte->relkind = rel->rd_rel->relkind;
> + rte->rellockmode = AccessShareLock;
> + ExecInitRangeTable(estate, list_make1(rte));
> +
> + estate->es_output_cid = GetCurrentCommandId(false);
> +
> + return estate;
> +}
>

Yes, that other code looks similar, but I am not sure it is worth
rearranging things for the sake of trying to make use of only 5 or 6
common LOC. Anyway, I felt this review comment is not really related
to the RF patch; It seems more like a potential idea for a future
patch to use some common code *after* the RF code is committed. So I
skipped this comment.

> 3) In one place select is in lower case, it can be changed to upper case
> + resetStringInfo(&cmd);
> + appendStringInfo(&cmd,
> + "SELECT DISTINCT
> pg_get_expr(prqual, prrelid) "
> + " FROM pg_publication p "
> + " INNER JOIN
> pg_publication_rel pr ON (p.oid = pr.prpubid) "
> + " WHERE pr.prrelid
> = %u AND p.pubname IN ( %s ) "
> + " AND NOT (select
> bool_or(puballtables) "
> + " FROM pg_publication "
> + " WHERE pubname
> in ( %s )) "
> + " AND (SELECT count(1)=0 "
> + " FROM
> pg_publication_namespace pn, pg_class c "
> + " WHERE c.oid =
> %u AND c.relnamespace = pn.pnnspid)",
> + lrel->remoteid,
> + pub_names.data,
> + pub_names.data,
> + lrel->remoteid);
>

Fixed in v52-0001 [1]

> 5) Should we mention user should take care of deletion of row filter
> records after table sync is done.
> + ALL TABLES</literal> or <literal>FOR ALL TABLES IN
> SCHEMA</literal> publication,
> + then all other <literal>WHERE</literal> clauses (for the same
> publish operation)
> + become redundant.
> + If the subscriber is a <productname>PostgreSQL</productname>
> version before 15
> + then any row filtering is ignored during the initial data
> synchronization phase.
>
Fixed in v52-0001 [1]

> 6) Should this be an Assert, since this will be taken care earlier by
> GetTransformedWhereClause->transformWhereClause->coerce_to_boolean:
> + expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype,
> BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
> +
> + if (expr == NULL)
> + ereport(ERROR,
> + (errcode(ERRCODE_CANNOT_COERCE),
> + errmsg("row filter returns type %s
> that cannot be cast to the expected type %s",
> + format_type_be(exprtype),
> + format_type_be(BOOLOID)),
> + errhint("You will need to rewrite the
> row filter.")));
>

Not yet fixed in v52-0001 [1], but I did add another regression test for this.

> 7) This code is present in 3 places, can we make it a function or
> macro and use it:
> + if (pub->pubactions.pubinsert)
> + no_filter[idx_ins] = true;
> + if (pub->pubactions.pubupdate)
> + no_filter[idx_upd] = true;
> + if (pub->pubactions.pubdelete)
> + no_filter[idx_del] = true;
> +
> + /* Quick exit loop if all pubactions
> have no row-filter. */
> + if (no_filter[idx_ins] &&
> no_filter[idx_upd] && no_filter[idx_del])
> + break;
> +
> + continue;
>

Fixed in v52-0001 [1]. Added a macro as suggested.

> 8) Can the below transformation be done just before the
> values[Anum_pg_publication_rel_prqual - 1] is set to reduce one of the
> checks:
> + if (pri->whereClause != NULL)
> + {
> + /* Set up a ParseState to parse with */
> + pstate = make_parsestate(NULL);
> +
> + /*
> + * Get the transformed WHERE clause, of boolean type,
> with necessary
> + * collation information.
> + */
> + whereclause = GetTransformedWhereClause(pstate, pri, true);
> + }
>
> /* Form a tuple. */
> memset(values, 0, sizeof(values));
> @@ -328,6 +376,12 @@ publication_add_relation(Oid pubid,
> PublicationRelInfo *targetrel,
> values[Anum_pg_publication_rel_prrelid - 1] =
> ObjectIdGetDatum(relid);
>
> + /* Add qualifications, if available */
> + if (whereclause)
> + values[Anum_pg_publication_rel_prqual - 1] =
> CStringGetTextDatum(nodeToString(whereclause));
> + else
> + nulls[Anum_pg_publication_rel_prqual - 1] = true;
> +
>
> Like:
> /* Add qualifications, if available */
> if (pri->whereClause != NULL)
> {
> /* Set up a ParseState to parse with */
> pstate = make_parsestate(NULL);
>
> /*
> * Get the transformed WHERE clause, of boolean type, with necessary
> * collation information.
> */
> whereclause = GetTransformedWhereClause(pstate, pri, true);
>
> /*
> * Walk the parse-tree of this publication row filter expression and
> * throw an error if anything not permitted or unexpected is
> * encountered.
> */
> rowfilter_walker(whereclause, targetrel);
> values[Anum_pg_publication_rel_prqual - 1] =
> CStringGetTextDatum(nodeToString(whereclause));
> }
> else
> nulls[Anum_pg_publication_rel_prqual - 1] = true;
>

Fixed in v52-0001 [1] as suggested.

------
[1] https://www.postgresql.org/message-id/CAHut%2BPs3BvAqcNXmMMRBUjOe3GWor0d7r%2BmPxxtzMhYEf59t_Q%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-12-22 07:47:12 Re: pg_upgrade should truncate/remove its logs before running
Previous Message Michael Paquier 2021-12-22 07:40:16 Re: Confused comment about drop replica identity index