Re: BUG #16811: Severe reproducible server backend crash

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: james(dot)inform(at)pharmapp(dot)de, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16811: Severe reproducible server backend crash
Date: 2021-01-07 10:14:07
Message-ID: CA+hUKG+KoZFkX7YR2u31TeQPOu=1nc6EPRdVU-Pn0ohpUJdGdg@mail.gmail.com
Lists: pgsql-bugs

On Thu, Jan 7, 2021 at 10:25 PM PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
> First things first: Happy new year to all of you and stay healthy during
> these days.
>
> I have run into a severe backend crash that makes the whole PostgreSQL
> server shut down and restart with a recovery message.
>
> This is reproducible on Mac using the latest version of Postgres.app
> (www.postgresapp.com), which comes with PG 13.1, as well as on Ubuntu with a
> self-built PG 13.1 and also with the latest version from the REL_13_STABLE
> branch.
>
> The issue doesn't exist on PG 12.5, neither on Mac nor on Ubuntu.

Thanks for the report. I happened to have DBeaver here and could
reproduce this, and got the following core:

#0  0x00005651d5c07927 in lnext (l=0x0, c=0x5651d794aca0)
    at ../../../src/include/nodes/pg_list.h:312
312         Assert(c >= &l->elements[0] && c < &l->elements[l->length]);
(gdb) bt
#0  0x00005651d5c07927 in lnext (l=0x0, c=0x5651d794aca0)
    at ../../../src/include/nodes/pg_list.h:312
#1  0x00005651d5c096c9 in PortalRunMulti (portal=0x5651d79900a0,
    isTopLevel=true, setHoldSnapshot=false,
    dest=0x5651d60a1580 <donothingDR>, altdest=0x5651d60a1580 <donothingDR>,
    qc=0x7ffc44cf7530) at pquery.c:1321
#2  0x00005651d5c08b31 in PortalRun (portal=0x5651d79900a0, count=200,
    isTopLevel=true, run_once=false,
    dest=0x5651d7928378, altdest=0x5651d7928378, qc=0x7ffc44cf7530)
    at pquery.c:779
#3  0x00005651d5c03fea in exec_execute_message
    (portal_name=0x5651d7927f60 "", max_rows=200) at postgres.c:2196
#4  0x00005651d5c06d28 in PostgresMain (argc=1, argv=0x7ffc44cf7760,
    dbname=0x5651d79235f8 "postgres",
    username=0x5651d7953f58 "tmunro") at postgres.c:4405
#5  0x00005651d5b481d8 in BackendRun (port=0x5651d794d100)
    at postmaster.c:4484
#6  0x00005651d5b47b07 in BackendStartup (port=0x5651d794d100)
    at postmaster.c:4206
#7  0x00005651d5b43f28 in ServerLoop () at postmaster.c:1730
#8  0x00005651d5b43777 in PostmasterMain (argc=3, argv=0x5651d79215b0)
    at postmaster.c:1402
#9  0x00005651d5a43279 in main (argc=3, argv=0x5651d79215b0) at main.c:209
(gdb) f 1
#1  0x00005651d5c096c9 in PortalRunMulti (portal=0x5651d79900a0,
    isTopLevel=true, setHoldSnapshot=false,
    dest=0x5651d60a1580 <donothingDR>, altdest=0x5651d60a1580 <donothingDR>,
    qc=0x7ffc44cf7530) at pquery.c:1321
1321            if (lnext(portal->stmts, stmtlist_item) != NULL)
(gdb) print portal->stmts
$1 = (List *) 0x0

I didn't have time to investigate whether this is the right fix, but
this cargo cult change fixes the problem:

--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1318,7 +1318,7 @@ PortalRunMulti(Portal portal,
                 * Increment command counter between queries, but not after the last
                 * one.
                 */
-               if (lnext(portal->stmts, stmtlist_item) != NULL)
+               if (portal->stmts && lnext(portal->stmts, stmtlist_item) != NULL)
                        CommandCounterIncrement();

Maybe something to do with commit
1cff1b95ab6ddae32faa3efe0d95a820dbfdc164. I can dig some more
tomorrow if someone doesn't beat me to it.
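
To make the failure mode concrete, here is a minimal standalone sketch of why
lnext() cannot be handed a NULL list and what the guard in the patch above
does. The List and ListCell types, sketch_lnext() and
more_statements_follow() are simplified stand-ins invented for illustration,
not the real definitions from pg_list.h; only the shape of the bounds check
follows the Assert shown at pg_list.h:312 in the backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins (assumption: not the real pg_list.h types). */
typedef struct ListCell
{
    int         value;
} ListCell;

typedef struct List
{
    int         length;     /* number of cells in use */
    ListCell   *elements;   /* array holding the cells */
} List;

/* An empty list is represented here by a NULL pointer. */
#define NIL ((List *) NULL)

/*
 * Same shape as the lnext() seen in the backtrace: the bounds check and the
 * end-of-list test both dereference l, so l must not be NULL.
 */
static ListCell *
sketch_lnext(const List *l, const ListCell *c)
{
    assert(l != NULL);
    assert(c >= &l->elements[0] && c < &l->elements[l->length]);
    c++;
    if (c < &l->elements[l->length])
        return (ListCell *) c;
    return NULL;
}

/*
 * Guarded form of the check, mirroring the patched condition: a NULL list is
 * treated as "no further statements" and sketch_lnext() is never reached.
 */
static bool
more_statements_follow(const List *stmts, const ListCell *cur)
{
    return stmts != NIL && sketch_lnext(stmts, cur) != NULL;
}
```

With a two-element list, more_statements_follow() is true for the first cell
and false for the last; with a NULL list it short-circuits to false instead of
dereferencing the pointer. This only illustrates the guard; it says nothing
about why portal->stmts is NULL at that point in PortalRunMulti().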
