RE: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'li jie' <ggysxcq(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Proposal: Filter irrelevant change before reassemble transactions during logical decoding
Date: 2024-05-20 04:58:31
Message-ID: OSBPR01MB255280A8AD442EE82DF8B604F5E92@OSBPR01MB2552.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Li,

Thanks for proposing and sharing the PoC. Here are my high-level comments.

1.
What if ALTER PUBLICATION ... ADD is executed in parallel?
Should we publish added tables if the altering is done before the transaction is
committed? The current patch seems unable to do so because changes for added
tables have not been queued at COMMIT.
If we should not publish such tables, why?

2.
This patch could not apply as-is. Please rebase.

3. FilterByTable()

```
+ if (ctx->callbacks.filter_by_origin_cb == NULL)
+ return false;
```

filter_by_table_cb should be checked instead of filter_by_origin_cb.
Current patch crashes if the filter_by_table_cb() is not implemented.

4. DecodeSpecConfirm()

```
+ if (FilterByTable(ctx, change))
+ return;
+
```

I'm not sure it is needed. Can you explain the reason why you added?

5. FilterByTable

```
+ switch (change->action)
+ {
+ /* intentionally fall through */
+ case REORDER_BUFFER_CHANGE_INSERT:
+ case REORDER_BUFFER_CHANGE_UPDATE:
+ case REORDER_BUFFER_CHANGE_DELETE:
+ break;
+ default:
+ return false;
+ }
```

IIUC, REORDER_BUFFER_CHANGE_TRUNCATE also targes the user table, so I think
it should be accepted. Thought?

6.

I got strange errors when I tested the feature. I thought this implied there were
bugs in your patch.

1. implemented no-op filter atop test_decoding like attached
2. ran `make check` for test_decoding modle
3. some tests failed. Note that "filter" was a test added by me.
regression.diffs was also attached.

```
not ok 1 - ddl 970 ms
ok 2 - xact 36 ms
not ok 3 - rewrite 525 ms
not ok 4 - toast 736 ms
ok 5 - permissions 50 ms
ok 6 - decoding_in_xact 39 ms
not ok 7 - decoding_into_rel 57 ms
ok 8 - binary 21 ms
not ok 9 - prepared 33 ms
ok 10 - replorigin 93 ms
ok 11 - time 25 ms
ok 12 - messages 47 ms
ok 13 - spill 8063 ms
ok 14 - slot 124 ms
ok 15 - truncate 37 ms
not ok 16 - stream 60 ms
ok 17 - stats 157 ms
ok 18 - twophase 122 ms
not ok 19 - twophase_stream 57 ms
not ok 20 - filter 20 ms
```

Currently I'm not 100% sure the reason, but I think it may read the latest system
catalog even if ALTER SUBSCRIPTION is executed after changes.
In below example, an attribute is altered text->somenum, after inserting data.
But get_changes() outputs as somenum.

```
BEGIN
- table public.replication_example: INSERT: id[integer]:1 somedata[integer]:1 text[character varying]:'1'
- table public.replication_example: INSERT: id[integer]:2 somedata[integer]:1 text[character varying]:'2'
+ table public.replication_example: INSERT: id[integer]:1 somedata[integer]:1 somenum[character varying]:'1'
+ table public.replication_example: INSERT: id[integer]:2 somedata[integer]:1 somenum[character varying]:'2'
COMMIT
```

Also, if the relfilenuber of the relation is changed, an ERROR is raised.

```
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
- data
-----------------------------------------------------------------------------
- BEGIN
- table public.tr_pkey: INSERT: id2[integer]:2 data[integer]:1 id[integer]:2
- COMMIT
- BEGIN
- table public.tr_pkey: DELETE: id[integer]:1
- table public.tr_pkey: DELETE: id[integer]:2
- COMMIT
-(7 rows)
-
+ERROR: could not map filenumber "base/16384/16397" to relation OID
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachment Content-Type Size
0001-WIP-implement-no-op-filter.txt text/plain 4.7 KB
regression.diffs application/octet-stream 163.9 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-05-20 05:49:40 Re: commitfest.postgresql.org is no longer fit for purpose
Previous Message Masahiko Sawada 2024-05-20 04:58:28 Re: Lowering the minimum value for maintenance_work_mem