From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <tmunro(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: Bug in v13 due to "Fix corruption when relation truncation fails." |
Date: | 2025-01-08 04:59:48 |
Message-ID: | CA+hUKGJBK_urEaQ4ii=dU=Mg84CB2yBgu3oM9RKtiBNX6eqafA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jan 8, 2025 at 4:31 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Thu, Dec 26, 2024 at 01:10:59PM +0300, Yura Sokolov wrote:
> > Commit "Fix corruption when relation truncation fails." [0] makes
> > smgrtruncate be called in a critical section. Unfortunately in version 13 it
> > leads to occasional call to palloc0 inside of critical section, which
> > triggers Assert in debug builds:
> >
> > - smgrtruncate calls mdtruncate
> > - mdtruncate may call register_dirty_segment
> > - register_dirty_segment calls RegisterSyncRequest
> > - RegisterSyncRequest calls ForwardSyncRequest
> > - ForwardSyncRequest may call CompactCheckpointerRequestQueue
> > - CompactCheckpointerRequestQueue may call palloc0 ...
> >
> > In versions 14 and above CompactCheckpointerRequestQueue does check for
> > critical section due to commit "Fix bugs in MultiXact truncation" [1],
> > which were backported down to 14 version, but not 13.
> >
> > We caught it in our private tests, so it is real.
> > Cherry-pick of [1] from 14 version to 13 solves the issue.
> >
> > [0] https://git.postgresql.org/gitweb/?p=postgresql.git;h=2280912165d
> > [1] https://git.postgresql.org/gitweb/?p=postgresql.git;h=4c8e00ae9ae
>
> Ah, you're right. This chain of events is possible in REL_13_STABLE,
> and only there. Thomas?
Yeah. 4c8e00ae9ae addressed two separate issues: (1) another critical
section thing just like this one, and this one in passing, and (2) a
deadlock risk involving checkpoint delays. I had actually worried
about that exact deadlock scenario a bit earlier[1] and figured out
that it couldn't happen in this code path, so I guess we technically
only need to back-patch the couple of lines with the fast return, but
it seems harmless, simple and good to back-patch Heikki's whole patch
from 14->13. I will study this a bit more and then go ahead and do
that. Thanks for the report, Yura, and the reminder, Michael.
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2025-01-08 05:24:56 | Re: Add “FOR UPDATE NOWAIT” lock details to the log. |
Previous Message | Triveni N | 2025-01-08 04:41:07 | Re: NOT ENFORCED constraint feature |