From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com>, "Callahan, Drew" <callaan(at)amazon(dot)com> |
Cc: | "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | RE: Potential data loss due to race condition during logical replication slot creation |
Date: | 2024-03-13 09:34:22 |
Message-ID: | TYCPR01MB1207719C811F580A8774C79B7F52A2@TYCPR01MB12077.jpnprd01.prod.outlook.com |
Lists: | pgsql-bugs |
Dear hackers,
While analyzing another failure [1], I found this thread. I think both failures
have the same cause.
The reported failure occurs when a replication slot is created in the middle
of a transaction and reuses the snapshot from another slot. The reproducer is:
```
-- Session0
SELECT pg_create_logical_replication_slot('slot0', 'test_decoding');
BEGIN;
INSERT INTO foo ...

-- Session1
SELECT pg_create_logical_replication_slot('slot1', 'test_decoding');

-- Session2
CHECKPOINT;
SELECT pg_logical_slot_get_changes('slot0', NULL, NULL);

-- Session0
INSERT INTO var ... -- var is defined with (user_catalog_table = true)
COMMIT;

-- Session1
SELECT pg_logical_slot_get_changes('slot1', NULL, NULL);
-- -> Assertion failure.
```
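The reproducer assumes that the tables foo and var already exist and that the
server runs with wal_level = logical and has test_decoding installed. A minimal
setup sketch could look like the following; the column definitions are only an
assumption for illustration and are not taken from the attached patch:

```sql
-- Logical decoding requires wal_level = logical; verify before starting.
SHOW wal_level;

-- Ordinary table used by the first INSERT (column layout is hypothetical).
CREATE TABLE foo (id int);

-- Table marked as a user catalog table, as noted in the reproducer
-- (column layout is hypothetical).
CREATE TABLE var (id int) WITH (user_catalog_table = true);
```

Run this once in any session before starting the steps of the reproducer.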
> Here is the summary of several proposals we've discussed:
> a) Have CreateInitDecodingContext() always pass need_full_snapshot =
> true to AllocateSnapshotBuilder().
> b) Have snapbuild.c being able to handle multiple SnapBuildOnDisk versions.
> c) Add a global variable, say in_create, to snapbuild.c
Regarding the three options raised by Sawada-san, I prefer approach a).
Since the issue can happen on all supported branches, we should choose the
conservative approach. Also, it would be quite painful to have several
different pieces of code handling the same issue.
The attached patch implements approach a), since no one has written one yet.
I also added a test that can trigger the assertion failure, but I am not sure
whether it should be included.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/
Attachment | Content-Type | Size |
---|---|---|
master_0001-fix-snapbuild-bug-by-approach-a.patch | application/octet-stream | 13.2 KB |