Re: BUG #17949: Adding an index introduces serialisation anomalies.

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: artem(dot)anisimov(dot)255(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17949: Adding an index introduces serialisation anomalies.
Date: 2023-06-16 23:22:48
Message-ID: CA+hUKGLE3SONfzBbj3q7T7qYx5FQ9Pk_iGx6N7nYim1qgRrvUg@mail.gmail.com
Lists: pgsql-bugs

On Thu, Jun 15, 2023 at 7:29 PM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
> I've tried to reproduce it as well, adding more logging around the
> serialization code. If it helps, what I observe is the second
> overlapping transaction, that has started a bit later, do not error out
> because in OnConflict_CheckForSerializationFailure (when checking for
> "writer has become a pivot") there are no more conflicts received from
> SHMQueueNext. All the rest of the reported serialization conflicts are
> coming from this check, so I assume the incorrect transaction should
> fail there too. Not sure yet why is that so.

Some more observations: happens on 11 and master, happens with btrees,
happens with bitmapscan disabled (e.g. with plain index scan), but so
far in my testing it doesn't happen if the table already contains one
other tuple (i.e. if you change the reproducer to insert another row
('foo') after the TRUNCATE). There is a special case for predicate
locking empty indexes, which uses a relation-level lock (since there are
no pages to lock yet), but that doesn't seem to be wrong and if you hack
it to lock pages 1 and 2 instead, it still reproduces. Pondering the
empty index case made me wonder if the case "If we found one of our
own SIREAD locks to remove, remove it now" was implicated (that's
something that would not happen for a relation-level lock), but it
still reproduces if you comment out that optimisation. So far I have
not been able to reproduce it below 8 threads. Hmm, I wonder if there
might be a missing check/lock in some racy code path around the
initial creation of the root page...
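
For anyone following along without the SSI code in their head, here is a
toy sketch of what the "writer has become a pivot" check quoted above is
looking for. It is not PostgreSQL's implementation: the class and method
names are made up for illustration, commit ordering is ignored, and the
only assumption is that a reader's rw-conflict is recorded when a writer
finds the reader's SIREAD lock. If that lock is not found, the edge is
never recorded and nobody looks like a pivot, which resembles the
symptom Dmitry describes.

```python
from collections import defaultdict


class ToyConflictTracker:
    """Toy model of rw-conflict edges; NOT PostgreSQL's data structures."""

    def __init__(self):
        # writer -> set of readers that have an rw-edge pointing at it
        self.in_conflicts = defaultdict(set)
        # reader -> set of writers it has an rw-edge pointing to
        self.out_conflicts = defaultdict(set)

    def record_rw_conflict(self, reader, writer):
        """Record that `writer` overwrote something `reader` had read."""
        self.out_conflicts[reader].add(writer)
        self.in_conflicts[writer].add(reader)

    def is_pivot(self, xact):
        """A pivot has both an incoming and an outgoing rw-edge."""
        has_in = bool(self.in_conflicts.get(xact))
        has_out = bool(self.out_conflicts.get(xact))
        return has_in and has_out


# Write skew between two overlapping transactions (hypothetical names).
tracker = ToyConflictTracker()

# Only one edge was noticed, e.g. because the other SIREAD lock was missed.
tracker.record_rw_conflict("T1", "T2")
missed_edge = (tracker.is_pivot("T1"), tracker.is_pivot("T2"))

# Once the second edge is recorded too, both transactions are pivots.
tracker.record_rw_conflict("T2", "T1")
both_edges = (tracker.is_pivot("T1"), tracker.is_pivot("T2"))
```

In the toy model a single missing edge is enough to let both
transactions through, so a predicate lock that is taken too late or
looked up in the wrong place would be sufficient to explain the
behaviour, though that is only a guess at this point.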
