Re: MultiXact\SLRU buffers configuration

From: Thom Brown <thom(at)linux(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, vignesh C <vignesh21(at)gmail(dot)com>, Andrew Borodin <amborodin86(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Gilles Darold <gilles(at)darold(dot)net>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MultiXact\SLRU buffers configuration
Date: 2024-10-29 17:45:02
Message-ID: CAA-aLv7AcTh+O9rnoPVbck=Fw8atMOYrUttk5G9ctFhXOsqQbw@mail.gmail.com
Lists: pgsql-hackers

On Tue, 29 Oct 2024 at 16:38, Thom Brown <thom(at)linux(dot)com> wrote:
>
> On Fri, 23 Aug 2024 at 01:29, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>>
>> On Thu, Aug 22, 2024 at 10:36:38AM -0400, Alvaro Herrera wrote:
>> > On 2024-Aug-22, Michael Paquier wrote:
>> >> I'm not sure that we need to get down to that until somebody has a
>> >> case where they want to rely on stats of injection points for their
>> >> stuff. At this stage, I only want the stats to be enabled to provide
>> >> automated checks for the custom pgstats APIs, so disabling it by
>> >> default and enabling it only in the stats test of the module
>> >> injection_points sounds kind of enough to me for now.
>> >
>> > Oh! I thought the stats were useful by themselves.
>>
>> Yep, currently they're not, but I don't want to discard that they'll
>> never be, either. Perhaps there would be a case where somebody would
>> like to run a callback N times and trigger a condition? That's
>> something where the stats could be useful, but I don't have a specific
>> case for that now. I'm just imagining possibilities.
>
>
> I believe I am seeing the problem being discussed occurring on a production system running 15.6, causing ever-increasing replay lag on the standby, until I cancel the offending process on the standby and force it to process its interrupts.
>
> Here's the backtrace before I do that:
>
> #0 0x00007f4503b81876 in select () from /lib64/libc.so.6
> #1 0x0000558b0956891a in pg_usleep (microsec=microsec(at)entry=1000) at pgsleep.c:56
> #2 0x0000558b0917e01a in GetMultiXactIdMembers (from_pgupgrade=false, onlyLock=<optimized out>, members=0x7ffcd2a9f1e0, multi=109187502) at multixact.c:1392
> #3 GetMultiXactIdMembers (multi=109187502, members=members(at)entry=0x7ffcd2a9f1e0, from_pgupgrade=from_pgupgrade(at)entry=false, onlyLock=onlyLock(at)entry=false) at multixact.c:1224
> #4 0x0000558b0913de15 in MultiXactIdGetUpdateXid (xmax=<optimized out>, t_infomask=<optimized out>) at heapam.c:6924
> #5 0x0000558b09146028 in HeapTupleGetUpdateXid (tuple=tuple(at)entry=0x7f440d428308) at heapam.c:6965
> #6 0x0000558b0914c02f in HeapTupleSatisfiesMVCC (htup=0x558b0b7cbf20, htup=0x558b0b7cbf20, buffer=8053429, snapshot=0x558b0b63a2d8) at heapam_visibility.c:1089
> #7 HeapTupleSatisfiesVisibility (tup=tup(at)entry=0x7ffcd2a9f2b0, snapshot=snapshot(at)entry=0x558b0b63a2d8, buffer=buffer(at)entry=8053429) at heapam_visibility.c:1771
> #8 0x0000558b0913e819 in heapgetpage (sscan=sscan(at)entry=0x558b0b7ccfa0, page=page(at)entry=115) at heapam.c:468
> #9 0x0000558b0913eb7e in heapgettup_pagemode (scan=scan(at)entry=0x558b0b7ccfa0, dir=ForwardScanDirection, nkeys=0, key=0x0) at heapam.c:1120
> #10 0x0000558b0913fb5e in heap_getnextslot (sscan=0x558b0b7ccfa0, direction=<optimized out>, slot=0x558b0b7cc000) at heapam.c:1352
> #11 0x0000558b092c0e7a in table_scan_getnextslot (slot=0x558b0b7cc000, direction=ForwardScanDirection, sscan=<optimized out>) at ../../../src/include/access/tableam.h:1046
> #12 SeqNext (node=0x558b0b7cbe10) at nodeSeqscan.c:80
> #13 0x0000558b0929b9bf in ExecScan (node=0x558b0b7cbe10, accessMtd=0x558b092c0df0 <SeqNext>, recheckMtd=0x558b092c0dc0 <SeqRecheck>) at execScan.c:198
> #14 0x0000558b09292cb2 in ExecProcNode (node=0x558b0b7cbe10) at ../../../src/include/executor/executor.h:262
> #15 ExecutePlan (execute_once=<optimized out>, dest=0x558b0bca1350, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x558b0b7cbe10, estate=0x558b0b7cbbe8)
> at execMain.c:1636
> #16 standard_ExecutorRun (queryDesc=0x558b0b8c9798, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:363
> #17 0x00007f44f64d43c5 in pgss_ExecutorRun (queryDesc=0x558b0b8c9798, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:1010
> #18 0x0000558b093fda0f in PortalRunSelect (portal=portal(at)entry=0x558b0b6ba458, forward=forward(at)entry=true, count=0, count(at)entry=9223372036854775807, dest=dest(at)entry=0x558b0bca1350) at pquery.c:924
> #19 0x0000558b093fedb8 in PortalRun (portal=portal(at)entry=0x558b0b6ba458, count=count(at)entry=9223372036854775807, isTopLevel=isTopLevel(at)entry=true, run_once=run_once(at)entry=true, dest=dest(at)entry=0x558b0bca1350,
> altdest=altdest(at)entry=0x558b0bca1350, qc=0x7ffcd2a9f7a0) at pquery.c:768
> #20 0x0000558b093fb243 in exec_simple_query (
> query_string=0x558b0b6170c8 "<redacted>")
> at postgres.c:1250
> #21 0x0000558b093fd412 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4598
> #22 0x0000558b0937e170 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4514
> #23 BackendStartup (port=<optimized out>) at postmaster.c:4242
> #24 ServerLoop () at postmaster.c:1809
> #25 0x0000558b0937f147 in PostmasterMain (argc=argc(at)entry=5, argv=argv(at)entry=0x558b0b5d0a30) at postmaster.c:1481
> #26 0x0000558b09100a2c in main (argc=5, argv=0x558b0b5d0a30) at main.c:202
>
> This occurred twice, meaning 2 processes needed terminating.

Taking a look at what's happening under the hood, the backend appears
to be stuck in this loop in GetMultiXactIdMembers() (multixact.c:1392,
frame #2 of the backtrace):

    if (nextMXOffset == 0)
    {
        /* Corner case 2: next multixact is still being filled in */
        LWLockRelease(MultiXactOffsetSLRULock);
        CHECK_FOR_INTERRUPTS();
        pg_usleep(1000L);
        goto retry;
    }

The loop clearly calls CHECK_FOR_INTERRUPTS(), but when I saw this
issue happen, the process wasn't interruptible.
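
For context, CHECK_FOR_INTERRUPTS() only does anything when an
interrupt is pending and nothing up the stack has held interrupts off.
Here is a tiny, self-contained toy model of that mechanism (not
PostgreSQL code; the names mirror the real ones in miscadmin.h and
postgres.c, but the bodies are purely illustrative):

    /*
     * Toy model of why a CHECK_FOR_INTERRUPTS()-style check can be a
     * no-op: interrupts are only serviced when nothing has "held them
     * off", which LWLockAcquire() does in the real server.
     */
    #include <stdio.h>

    static volatile int InterruptPending = 0;
    static int InterruptHoldoffCount = 0;

    static void
    ProcessInterrupts(void)
    {
        /* The real ProcessInterrupts() bails out while interrupts are held off. */
        if (InterruptHoldoffCount != 0)
            return;
        InterruptPending = 0;
        printf("interrupt serviced: the query would be cancelled here\n");
    }

    #define CHECK_FOR_INTERRUPTS() \
        do { \
            if (InterruptPending) \
                ProcessInterrupts(); \
        } while (0)

    int
    main(void)
    {
        InterruptPending = 1;       /* e.g. a cancel/terminate signal arrived */

        InterruptHoldoffCount++;    /* e.g. an LWLock is still held up the stack */
        CHECK_FOR_INTERRUPTS();     /* no-op: interrupts are held off */

        InterruptHoldoffCount--;    /* lock released */
        CHECK_FOR_INTERRUPTS();     /* now the interrupt is actually serviced */

        return 0;
    }

If I'm reading frame #8 correctly, heapgetpage() still holds the
buffer content lock while checking tuple visibility, and
LWLockAcquire() holds interrupts off, which would explain why the
CHECK_FOR_INTERRUPTS() in this loop never gets a chance to act.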

I looked around for similar reports and found a fork of Postgres
describing the same issue, with more detail than I thought to gather
at the time. They determined that the root cause is a deadlock arising
from out-of-order multixact playback: the replay process waits
indefinitely for a page pin to be released before it can proceed,
while the backend holding that pin is waiting for a multixact ID that
hasn't been replayed yet because of the log ordering. What they did to
"fix" the issue in their version was to raise an error when the
reading backend cannot retrieve the next multixact ID, allowing it to
exit instead of remaining stuck indefinitely. This took the form of
only allowing the loop to retry 100 times before erroring out.
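
For illustration, that bounded-retry approach might look roughly like
this when applied to the excerpt above. This is my sketch, not their
actual patch: the "retries" counter (assumed to be declared and zeroed
before the retry label), the cap of 100, and the error wording are all
made up:

    /*
     * Sketch only: bounded-retry variant of the "corner case 2" branch in
     * GetMultiXactIdMembers().  "retries" is assumed to be a local counter
     * initialised to zero before the retry label; the cap and message are
     * illustrative.
     */
    if (nextMXOffset == 0)
    {
        /* Corner case 2: next multixact is still being filled in */
        LWLockRelease(MultiXactOffsetSLRULock);

        if (++retries >= 100)
            ereport(ERROR,
                    (errcode(ERRCODE_INTERNAL_ERROR),
                     errmsg("could not get offset of next MultiXactId after %d retries",
                            retries)));

        CHECK_FOR_INTERRUPTS();
        pg_usleep(1000L);
        goto retry;
    }

Presumably that trades an indefinite hang for a query error, but it at
least lets the stuck backend release its page pin so replay can
continue.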

Thom
