Re: Get rid of WALBufMappingLock

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Victor Yegorov <vyegorov(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Get rid of WALBufMappingLock
Date: 2025-02-28 07:27:29
Message-ID: Z8FlYYhQBOkWMHuj@paquier.xyz
Lists: pgsql-hackers

On Wed, Feb 26, 2025 at 01:48:47PM +0200, Alexander Korotkov wrote:
> Thank you for offering the help. Updated version of patch is attached
> (I've added one memory barrier there just in case). I would
> appreciate if you could run it on batta, hachi or similar hardware.

Reverting the revert done in 3fb58625d18f shows that reproducing the
error is not really difficult. I ran make installcheck-world
USE_MODULE_DB=1 -j N without assertions, and the first thing to fail,
quickly, was one of the pgbench tests:
PANIC: could not find WAL buffer for 0/19D366

This one happened for the test "concurrent OID generation" and CREATE
TYPE. Of course, as it is a race condition, it is random, but it
takes me only a couple of minutes to see the original issue on my
buildfarm host. With assertions enabled, same story, and same
failure from the pgbench TAP test.

That said, I have also run similar tests with your v12 for a couple
of hours and it looks stable under installcheck-world. I can see
that you've reworked the surroundings of InitializedFrom quite a bit
in this one. If you apply that once again at some point, the
buildfarm will be the judge in the long term, but I am rather
confident in saying that the situation looks better here, at least.

One thing I would consider doing if you want to gain confidence is
to run tests like the one I saw causing problems with pgbench, with
DDL patterns stressing specific paths like this CREATE TYPE case.
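
A minimal sketch of what such a stress run could look like, not the
actual TAP test. It assumes a running server reachable with the default
connection settings and a database named "postgres". The script file
name, type name, client count and duration are arbitrary illustrative
choices. pgbench is only invoked when RUN_STRESS=1 is set; otherwise
the command is just printed.

```shell
#!/bin/sh
# Sketch only: file name, type name, client count, duration and the
# "postgres" database are assumptions, not taken from the TAP test.
set -eu

SCRIPT="${TMPDIR:-/tmp}/create_type_stress.sql"

# Each client repeatedly creates and drops an enum type in its own
# temporary schema, so many backends write catalog WAL concurrently.
cat > "$SCRIPT" <<'EOF'
CREATE TYPE pg_temp.stress_enum AS ENUM ('a', 'b', 'c');
DROP TYPE pg_temp.stress_enum;
EOF

# -n skips vacuuming the standard pgbench tables (not used here),
# -f runs the custom script, -c/-j set clients and worker threads,
# -T is the run duration in seconds.
CMD="pgbench -n -f $SCRIPT -c 16 -j 4 -T 300 postgres"
echo "$CMD"

# Only touch a live server when explicitly asked to.
if [ "${RUN_STRESS:-0}" = "1" ]; then
    $CMD
fi
```

A PANIC like the one above would terminate the client connections, so
pgbench would be expected to abort early rather than run for the full
duration.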
--
Michael
