Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: li jie <ggysxcq(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding
Date: 2024-06-05 01:24:45
Message-ID: CAFPTHDbS_accCqVwAgk48eWL-w4o3jwS+YENqLr2oyPNrSEfDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 22, 2024 at 2:17 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:

>
> The reason for the crash is that the RelationSyncCache was NULL prior to
> reaching a consistent point.
> Hi li jie, I see that you created a new thread with an updated version of
> this patch [1]. I used that patch and addressed the crash seen above,
> rebased the patch and addressed a few other comments.
> I'm happy to help you with this patch and address comments if you are not
> available.
>
> regards,
> Ajin Cherian
> Fujitsu Australia
> [1] -
> https://www.postgresql.org/message-id/CAGfChW7%2BZMN4_NHPgz24MM42HVO83ecr9TLfpihJ%3DM0s1GkXFw%40mail.gmail.com
>

I was discussing this with Kuroda-san who made a patch to add a table
filter with test_decoding plugin. The filter does nothing, just returns
false( doesn't filter anything) and I see that 8 test_decoding tests fail.
In my analysis, I could see that some of the failures were because the new
filter logic was accessing the relation cache using the latest snapshot for
relids which was getting incorrect relation information while decoding
attribute values.
for eg:
CREATE TABLE replication_example(id SERIAL PRIMARY KEY, somedata int, text
varchar(120));
BEGIN;
INSERT INTO replication_example(somedata, text) VALUES (1, 1);
INSERT INTO replication_example(somedata, text) VALUES (1, 2);
COMMIT;
ALTER TABLE replication_example RENAME COLUMN text TO somenum;
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1');

Here, after decoding, the changes for the INSERT, were reflecting the new
column name (somenum) which was altered in a later transaction. This is
because the new Filterby Table() logic was getting relcache with latest
snapshot and does not use a historic snapshot like logical decoding should
be doing. This is because the changes are at the decode.c level and not the
reorderbuffer level and does not have access to the txn from the
reorderbuffer. This problem could be fixed by invalidating the cache in the
FilterByTable() logic, but this does not solve other problems like the
table name itself is changed in a later transaction. I think the patch has
a fundamental problem that the table filter logic does not respect the
snapshot of the transaction being decoded. The historic snapshot is
currently only set up when the actual changes are committed or streamed at
ReorderBufferProcessTXN().

If the purpose of the patch is to filter out unnecessary changes prior to
actual decode, then it will use an invalid snapshot and have lots of
problems. Otherwise this logic has to be moved to the reorderbuffer level
and there will be a big overhead of extracting reorderbuffer while each
change is queued in memory/disk.
regards,
Ajin Cherian
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii.Yuki@df.MitsubishiElectric.co.jp 2024-06-05 01:26:30 RE: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Previous Message David E. Wheeler 2024-06-05 00:45:03 Re: Patch bug: Fix jsonpath .* on Arrays