Re: BUG #13985: Segmentation fault on PREPARE TRANSACTION

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: chris(dot)tessels(at)inergy(dot)nl, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13985: Segmentation fault on PREPARE TRANSACTION
Date: 2016-02-24 21:52:21
Message-ID: 20160224215221.bbbduc5nelq7tf6s@alap3.anarazel.de
Lists: pgsql-bugs

On 2016-02-24 17:52:37 -0300, Alvaro Herrera wrote:
> chris(dot)tessels(at)inergy(dot)nl wrote:
>
> > Core was generated by `postgres: mailinfo_ow mailinfo_ods 10.50.6.6(4188'.
> > Program terminated with signal 11, Segmentation fault.
> >
> > #0 MinimumActiveBackends (min=50) at procarray.c:2472
> > 2472 if (pgxact->xid == InvalidTransactionId)
>
> It's not surprising that you're not able to make this crash
> consistently, because it looks like the problem might be in concurrent
> modifications to the PGXACT array. This routine, MinimumActiveBackends,
> walks the PGPROC array explicitly without locks. There are comments
> indicating that this is safe, but evidently something has slipped in
> there.
>
> Apparently this code is trying to dereference an invalid pgxact, but
> it's not clear to me how this happens. Those structs are allocated in
> advance, and they are referenced in the code via array indexes, so even
> if the pgxact doesn't actually hold data about a valid transaction,
> dereferencing the XID shouldn't cause a crash.

Well, that code is pretty, uh, questionable. E.g. for
    int pgprocno = arrayP->pgprocnos[index];
    volatile PGPROC *proc = &allProcs[pgprocno];
    volatile PGXACT *pgxact = &allPgXact[pgprocno];
there's no guarantee that pgprocno is actually the same index for both
lookups and the following
    if (pgprocno == -1)
        continue; /* do not count deleted entries */
check. It's perfectly reasonable for a compiler to reload pgprocno from
memory, or just always reference it via memory.

I presume what happened here is that initially arrayP->pgprocnos[index]
was -1, but by the time if (pgprocno == -1) is reached, it changed to a
different value.
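
A minimal single-threaded sketch of that presumed sequence, for illustration only. The names read_slot, demo_reads and lookup_used_stale_index are hypothetical and not part of PostgreSQL; read_slot stands in for a load of arrayP->pgprocnos[index] whose value is changed by another backend between two loads. The function mimics what a compiler could legally emit if it re-fetches pgprocno from memory at each use.

```c
#include <assert.h>

/* Hypothetical counter: how many times the shared slot has been read. */
static int demo_reads = 0;

/* Stand-in for loading arrayP->pgprocnos[index] while another backend
 * updates it: the first read sees a deleted entry (-1), later reads
 * see slot 3. */
static int
read_slot(void)
{
    return (demo_reads++ == 0) ? -1 : 3;
}

/* Mimics the original loop body when the compiler reloads pgprocno
 * for each use instead of keeping one copy in a register.
 * Returns 1 if index -1 was used for the lookup even though the
 * deleted-entry check did not fire. */
static int
lookup_used_stale_index(void)
{
    int index_for_lookup;
    int index_for_check;

    demo_reads = 0;
    index_for_lookup = read_slot();   /* used for &allPgXact[pgprocno] */
    index_for_check = read_slot();    /* used for if (pgprocno == -1) */

    assert(demo_reads == 2);          /* two separate loads happened */
    return index_for_lookup == -1 && index_for_check != -1;
}
```

In this simulation the lookup uses -1 while the check sees 3, so the loop would go on to dereference an element before the start of the array.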

It's also really crummy that we're doing the PGPROC/PGXACT lookups
before checking whether pgprocno is -1.

At the very least ISTM that we have to make pgprocno volatile (or use a
memory barrier - but we don't have sufficient support for those in the
older branches), and move the PGPROC/PGXACT lookups after the == -1
check.
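
A rough sketch of that reordering, not the actual PostgreSQL patch. DemoXact, DemoProcArray, DEMO_INVALID_XID and count_active are hypothetical stand-ins for PGXACT, the proc array struct, InvalidTransactionId and the loop in MinimumActiveBackends. It assumes one way of getting the volatile effect: the shared slot is read exactly once through a volatile-qualified pointer, and the element pointer is formed only after the -1 check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the real structs. */
typedef struct { unsigned int xid; } DemoXact;

typedef struct {
    int numProcs;
    int pgprocnos[8];
} DemoProcArray;

#define DEMO_INVALID_XID 0u

/* Count entries that have a valid xid, reading each slot exactly once. */
static int
count_active(const DemoProcArray *arrayP, const DemoXact *allXact)
{
    int count = 0;

    assert(arrayP != NULL && allXact != NULL);

    for (int index = 0; index < arrayP->numProcs; index++)
    {
        /* Single volatile load: the compiler may not re-fetch the slot. */
        int pgprocno = *(volatile const int *) &arrayP->pgprocnos[index];

        if (pgprocno == -1)
            continue;           /* do not count deleted entries */

        /* The lookup happens only after the index is known to be valid. */
        volatile const DemoXact *pgxact = &allXact[pgprocno];

        if (pgxact->xid == DEMO_INVALID_XID)
            continue;           /* no transaction running in this slot */

        count++;
    }
    return count;
}
```

With pgprocnos = {1, -1, 3, 0} and xids {0, 7, 0, 9}, the deleted slot and the slot without an xid are skipped and the other two are counted.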

Andres
