Re: row filtering for logical replication

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-13 21:06:36
Message-ID: 202107132106.wvjgvjgcyezo@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021-Jul-13, Euler Taveira wrote:

> + <para>
> + The <literal>WHERE</literal> clause should contain only columns that are
> + part of the primary key or be covered by <literal>REPLICA
> + IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
> + be replicated. That's because old row is used and it only contains primary
> + key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
> + remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
> + and <command>UPDATE</command> operations, any column might be used in the
> + <literal>WHERE</literal> clause. New row is used and it contains all
> + columns. A <literal>NULL</literal> value causes the expression to evaluate
> + to false; avoid using columns without not-null constraints in the
> + <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
> + not allow functions and user-defined operators.
> + </para>

There's a couple of points in this paragraph ..

1. if you use REPLICA IDENTITY FULL, then the expressions would work
even if they use any other column with DELETE. Maybe it would be
reasonable to test for this in the code and raise an error if the
expression requires a column that's not part of the replica identity.
(But that could be relaxed if the publication does not publish
updates/deletes.)

2. For UPDATE, does the expression apply to the old tuple or to the new
tuple? You say it's the new tuple, but from the user point of view I
think it would make more sense that it would apply to the old tuple.
(Of course, if you're thinking that the R.I. is the PK and the PK is
never changed, then you don't really care which one it is, but I bet
that some people would not like that assumption.)

I think it is sensible that it's the old tuple that is matched, not the
new; consider what happens if you change the PK in the update and the
replica already has that tuple. If you match on the new tuple and it
doesn't match the expression (so you filter out the update), but the old
tuple does match the expression, then the replica will retain the
mismatching tuple forever.

3. You say that a NULL value in any of those columns causes the
expression to become false and thus the tuple is not published. This
seems pretty unfriendly, but maybe it would be useful to have examples
of the behavior. Does ExecInitCheck() handle things in the other way,
and if so does using a similar trick give more useful behavior?

<para>
The WHERE clause may only contain references to columns that are part
of the table's replica identity.
If <>DELETE</> or <>UPDATE</> operations are published, this
restriction can be bypassed by making the replica identity be the whole
row with <command>ALTER TABLE .. SET REPLICA IDENTITY FULL</command>.
The <literal>WHERE</literal> clause does not allow functions or
user-defined operators.
</para>

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
"The Gord often wonders why people threaten never to come back after they've
been told never to return" (www.actsofgord.com)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message r.takahashi_2@fujitsu.com 2021-07-13 21:34:19 RE: Transactions involving multiple postgres foreign servers, take 2
Previous Message Euler Taveira 2021-07-13 21:02:43 Re: row filtering for logical replication