From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: row filtering for logical replication |
Date: | 2022-01-13 06:49:20 |
Message-ID: | CAHut+Pv3ow1pmQZ5MQme8Zgie_QsD0_B-BCrRi7pJt-32rSbSQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thanks for posting the merged v63.
Here are my review comments for the v63-0001 changes.
~~~
1. src/backend/replication/logical/proto.c - logicalrep_write_tuple
TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
int i;
uint16 nliveatts = 0;
Those separate declarations for values / isnull are not strictly
needed anymore, so those vars could be deleted. IIRC those were only
added before when there were both slots and tuples. OTOH, maybe you
prefer to keep it this way just for code readability?
~~~
2. src/backend/replication/pgoutput/pgoutput.c - typedef
+typedef enum RowFilterPubAction
+{
+ PUBACTION_INSERT,
+ PUBACTION_UPDATE,
+ PUBACTION_DELETE,
+ NUM_ROWFILTER_PUBACTIONS /* must be last */
+} RowFilterPubAction;
This typedef is not currently used by any of the code.
So I think choices are:
- Option 1: remove the typedef, because nobody is using it.
- Option 2: keep the typedef, but use it! e.g. everywhere there is an
exprstate array index variable probably it should be declared as a
'RowFilterPubAction idx' instead of just 'int idx'.
I prefer option 2, but YMMV.
~~~
3. src/backend/replication/pgoutput/pgoutput.c - map_changetype_pubaction
After this recent v63 refactoring and merging of some APIs it seems
that the map_changetype_pubaction is now ONLY used by
pgoutput_row_filter function. So this can now be a static member of
pgoutput_row_filter function instead of being declared at file scope.
~~~
4. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter comments
+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ *
+ * FOR INSERT: evaluates the row filter for new tuple.
+ * FOR DELETE: evaluates the row filter for old tuple.
+ * For UPDATE: evaluates the row filter for old and new tuple. If both
+ * evaluations are true, it sends the UPDATE. If both evaluations are false, it
+ * doesn't send the UPDATE. If only one of the tuples matches the row filter
+ * expression, there is a data consistency issue. Fixing this issue requires a
+ * transformation.
+ *
+ * Transformations:
+ * Updates are transformed to inserts and deletes based on the
+ * old tuple and new tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ *
+ * Case 1: old-row (no match) new-row (no match) -> (drop change)
+ * Case 2: old-row (no match) new row (match) -> INSERT
+ * Case 3: old-row (match) new-row (no match) -> DELETE
+ * Case 4: old-row (match) new row (match) -> UPDATE
+ *
+ * If the change is to be replicated this function returns true, else false.
+ *
+ * Examples:
The function header comment says the same thing 2x about the return values.
The 1st text "If it returns true, the change is replicated, otherwise,
it is not." should be replaced by the better wording of the 2nd text
("If the change is to be replicated this function returns true, else
false."). Then, that 2nd text can be removed (from where it is later
in this same comment).
~~~
5. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter
+ ExprContext *ecxt;
+ int filter_index = map_changetype_pubaction[*action];
+
+ /* *action is already assigned default by caller */
+ Assert(*action == REORDER_BUFFER_CHANGE_INSERT ||
+ *action == REORDER_BUFFER_CHANGE_UPDATE ||
+ *action == REORDER_BUFFER_CHANGE_DELETE);
+
Accessing the map_changetype_pubaction array should be done *after* the Assert.
~~~
6. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter
Actually, instead of assigning the filter_insert and then referring to
entry->exprstate[filter_index] in multiple places, now the code might
be neater if we simply assign a local variable “filter_exprstate” like
below and use that instead.
ExprState *filter_exprstate;
...
filter_exprstate = entry->exprstate[map_changetype_pubaction[*action]];
Please consider what way you think is best.
~~~
7. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter
+ /*
+ * For the following occasions where there is only one tuple, we can
+ * evaluates the row filter for the not null tuple and return.
+ *
+ * For INSERT: we only have new tuple.
+ *
+ * For UPDATE: if no old tuple, it means none of the replica identity
+ * columns changed and this would reduce to a simple update. we only need
+ * to evaluate the row filter for new tuple.
+ *
+ * FOR DELETE: we only have old tuple.
+ */
There are several things not quite right with that comment:
a. I thought now it should refer to "slots" instead of "tuples"
b. Some of the upper/lowercase is wonky
c. Maybe it reads better without the ":"
Suggested replacement comment:
/*
* For the following occasions where there is only one slot, we can
* evaluates the row filter for the not-null slot and return.
*
* For INSERT we only have the new slot.
*
* For UPDATE if no old slot, it means none of the replica identity
* columns changed and this would reduce to a simple update. We only need
* to evaluate the row filter for the new slot.
*
* For DELETE we only have the old slot.
*/
~~~
8. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter
+ if (!new_slot || !old_slot)
+ {
+ ecxt->ecxt_scantuple = new_slot ? new_slot : old_slot;
+ result = pgoutput_row_filter_exec_expr(entry->exprstate[filter_index],
+ ecxt);
+
+ FreeExecutorState(estate);
+ PopActiveSnapshot();
+
+ return result;
+ }
+
+ tmp_new_slot = new_slot;
+ slot_getallattrs(new_slot);
+ slot_getallattrs(old_slot);
I think after this "if" condition then the INSERT, DELETE and simple
UPDATE are already handled. So, the remainder of the code is for
deciding what update transformation is needed etc.
I think there should be some block comment somewhere here to make that
more obvious.
~~~
9. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter
Many of the comments in this function are still referring to old/new
"tuple". Now that all the params are slots instead of tuples maybe now
all the comments should also refer to "slots" instead of "tuples".
Please search all the comments - e.g. including all the "Case 1:" ...
"Case 4:" comments.
~~~
10. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter_init
+ int idx_ins = PUBACTION_INSERT;
+ int idx_upd = PUBACTION_UPDATE;
+ int idx_del = PUBACTION_DELETE;
These variables are unnecessary now... They previously were added only
as short synonyms because the other enum names were too verbose (e.g.
REORDER_BUFFER_CHANGE_INSERT) but now that we have shorter enum names
like PUBACTION_INSERT we can just use those names directly
~~~
11. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter_init
I felt that the code would seem more natural if the
pgoutput_row_filter_init function came *before* the
pgoutput_row_filter function in the source code.
~~~
12. src/backend/replication/pgoutput/pgoutput.c - pgoutput_change
@@ -634,6 +1176,9 @@ pgoutput_change(LogicalDecodingContext *ctx,
ReorderBufferTXN *txn,
RelationSyncEntry *relentry;
TransactionId xid = InvalidTransactionId;
Relation ancestor = NULL;
+ ReorderBufferChangeType modified_action = change->action;
+ TupleTableSlot *old_slot = NULL;
+ TupleTableSlot *new_slot = NULL;
It seemed a bit misleading to me to call this variable
'modified_action' since mostly it is not modified at all.
IMO it is better just to call this as 'action' but then add a comment
(above the "switch (modified_action)") to say the previous call to
pgoutput_row_filter may have transformed it to a different action.
~~~
13. src/tools/pgindent/typedefs.list - RowFilterPubAction
If you choose to keep the typedef for RowFilterPubAction (ref to
comment #1) then it should also be added to the typedefs.list.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | tanghy.fnst@fujitsu.com | 2022-01-13 06:57:42 | RE: [PATCH]Add tab completion for foreigh table |
Previous Message | Amit Langote | 2022-01-13 06:39:12 | Re: ExecRTCheckPerms() and many prunable partitions |