RE: [BUG?] check_exclusion_or_unique_constraint false negative

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: RE: [BUG?] check_exclusion_or_unique_constraint false negative
Date: 2024-08-12 03:32:32
Message-ID: OS0PR01MB5716FFD8DBBADB55E8E6935994852@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hi,

Thanks for reporting the issue!

I tried to reproduce this in logical replication but failed. If possible,
could you please share some steps to reproduce it in a logical replication
context?

In my test, if the tuple is updated and the new tuple is in the same page,
heapam_index_fetch_tuple should find the new tuple by following the HOT chain.
So it is unclear to me how the updated tuple gets missed. Maybe I have
overlooked some other condition that is required for this issue.

It would be better if we could reproduce this by adding some breakpoints with
gdb, which may help us write a TAP test that uses injection points to reproduce
this reliably. I see that the TAP test you shared uses pgbench to reproduce
this; it works, but it would be great if we could analyze the issue more deeply
by debugging the code.

I also have a few questions about the steps you shared:

> * Session 1 reads a B-tree page using SnapshotDirty and copies item X to the buffer.
> * Session 2 updates item X, inserting a new TID Y into the same page.
> * Session 2 commits its transaction.
> * Session 1 starts to fetch from the heap and tries to fetch X, but it was
> already deleted by session 2. So, it goes to the B-tree for the next TID.
> * The B-tree goes to the next page, skipping Y.
> * Therefore, the search finds nothing, but tuple Y is still alive.

I am wondering at which point the update should happen. Should it happen after
calling index_getnext_tid and before index_fetch_heap? It would be great if
you could give more details on the steps above. Thanks!
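
To make the question concrete, here is a toy Python model of the interleaving
as I read it from the quoted steps. It is not PostgreSQL code: the names
(HeapTuple, IndexPage, scan_for_key, run) are made up for illustration, and it
assumes two things that are exactly what I would like confirmed: that the
update creates a new index entry Y on the same leaf page (i.e. it is not a HOT
update), and that it lands after the scan has copied the page's items but
before the heap fetch.

```python
from dataclasses import dataclass


@dataclass
class HeapTuple:
    key: int
    live: bool = True


@dataclass
class IndexPage:
    tids: list  # TIDs stored on this leaf page


def scan_for_key(pages, heap, key, concurrent_update=None):
    """Toy index scan: copy a page's TIDs to a local buffer, then fetch each
    TID from the heap. Only the buffered copy is consulted afterwards."""
    found = []
    for page_no, page in enumerate(pages):
        # The scan copies the page's items; later changes to the page
        # are not visible through this buffer.
        buffered = list(page.tids)
        # Hypothetical hook: another session runs between the index page
        # read and the heap fetches (the assumed race window).
        if concurrent_update is not None and page_no == 0:
            concurrent_update()
        for tid in buffered:
            tup = heap[tid]
            if tup.live and tup.key == key:
                found.append(tid)
    return found


def run(with_race):
    heap = {"X": HeapTuple(key=1)}
    pages = [IndexPage(tids=["X"]), IndexPage(tids=[])]

    def update():
        # Session 2: X becomes dead, new version Y gets its own index
        # entry on the same page (assumed non-HOT update), then commits.
        heap["X"].live = False
        heap["Y"] = HeapTuple(key=1)
        pages[0].tids.append("Y")

    if with_race:
        return scan_for_key(pages, heap, 1, update)
    update()
    return scan_for_key(pages, heap, 1)
```

In this model, run(True) finds nothing even though Y is live, while run(False)
finds Y. Whether the real code path behaves this way, and at which call the
window actually opens, is what I am hoping the additional details can clarify.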

Best Regards,
Hou zj
