Re: row filtering for logical replication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2022-01-07 07:39:44
Message-ID: CAHut+PvC7XFEJDFpEdaAneNUNv9Lo8O9SjEQyzUsBObrdkwTaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Below are some review comments for the v60 patches.

V60-0001 Review Comments
========================

1. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter
unnecessary parens

+ /*
+ * NOTE: Multiple publication row filters have already been combined to a
+ * single exprstate (for this pubaction).
+ */
+ if (entry->exprstate[changetype])
+ {
+ /* Evaluates row filter */
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[changetype], ecxt);
+ }

This is a single statement so it may be better to rearrange by
removing the unnecessary parens and moving the comment.

e.g.

/*
* Evaluate the row filter.
*
* NOTE: Multiple publication row filters have already been combined to a
* single exprstate (for this pubaction).
*/
if (entry->exprstate[changetype])
result = pgoutput_row_filter_exec_expr(entry->exprstate[changetype], ecxt);

v60-0002 Review Comments
========================

1. Commit Message

Some parts of this message could do with some minor re-wording. Here
are some suggestions:

1a.
BEFORE:
Also tuples that have been deformed will be cached in slots to avoid
multiple deforming of tuples.
AFTER:
Also, tuples that are deformed will be cached in slots to avoid unnecessarily
deforming again.

1b.
BEFORE:
However, after the UPDATE the new
tuple doesn't satisfy the row filter then, from the data consistency
perspective, that row should be removed on the subscriber.
AFTER:
However, after the UPDATE the new
tuple doesn't satisfy the row filter, so from a data consistency
perspective, that row should be removed on the subscriber.

1c.
BEFORE:
Keep this row on the subscriber is undesirable because it...
AFTER
Leaving this row on the subscriber is undesirable because it...

1d.
BEFORE:
However, after the UPDATE
the new tuple does satisfies the row filter then, from the data
consistency perspective, that row should inserted on the subscriber.
AFTER:
However, after the UPDATE
the new tuple does satisfy the row filter, so from a data
consistency perspective, that row should be inserted on the subscriber.

1e.
"Subsequent UPDATE or DELETE statements have no effect."

Why won't they have an effect? The first impression is the newly
updated tuple now matches the filter, I think this part seems to need
some more detailed explanation. I saw there are some slightly
different details in the header comment of the
pgoutput_row_filter_update_check function - does it help?

~~

2. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter decl

+static bool pgoutput_row_filter(enum ReorderBufferChangeType
changetype, EState *relation, Oid relid,
+ HeapTuple oldtuple, HeapTuple newtuple,
+ TupleTableSlot *slot, RelationSyncEntry *entry);

The 2nd parameter should be called "EState *estate" (not "EState *relation").

~~

3. src/backend/replication/pgoutput/pgoutput.c -
pgoutput_row_filter_update_check header comment

This function header comment looks very similar to an extract from the
0002 comment message. So any wording improvements made to the commit
message (see review comment #1) should be made here in this comment
too.

~~

4. src/backend/replication/pgoutput/pgoutput.c -
pgoutput_row_filter_update_check inconsistencies, typos, row-filter

4a. The comments here are mixing terms like "oldtuple" / "old tuple" /
"old_tuple", and "newtuple" / "new tuple" / "new_tuple". I feel it
would read better just saying "old tuple" and "new tuple" within the
comments.

4b. Typo: "row-filter" --> "row filter" (for consistency with every
other usage where the hyphen is removed)

~~

5. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter
unnecessary parens

+ /*
+ * The default behavior for UPDATEs is to use the new tuple for row
+ * filtering. If the UPDATE requires a transformation, the new tuple will
+ * be replaced by the transformed tuple before calling this routine.
+ */
+ if (newtuple || oldtuple)
+ ExecStoreHeapTuple(newtuple ? newtuple : oldtuple,
ecxt->ecxt_scantuple, false);
+ else
+ {
+ ecxt->ecxt_scantuple = slot;
+ }

The else is a single statement so the parentheses are not needed here.

~~

6. src/include/replication/logicalproto.h

+extern void logicalrep_write_update_cached(StringInfo out,
TransactionId xid, Relation rel,
+ TupleTableSlot *oldtuple, TupleTableSlot *newtuple,
+ bool binary);

This extern seems unused ???

--------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-01-07 07:44:47 Re: preserve timestamps when installing headers
Previous Message Jeff Davis 2022-01-07 07:24:40 Re: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers