From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | chris(dot)tessels(at)inergy(dot)nl, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #13985: Segmentation fault on PREPARE TRANSACTION |
Date: | 2016-02-24 21:52:21 |
Message-ID: | 20160224215221.bbbduc5nelq7tf6s@alap3.anarazel.de |
Lists: | pgsql-bugs |
On 2016-02-24 17:52:37 -0300, Alvaro Herrera wrote:
> chris(dot)tessels(at)inergy(dot)nl wrote:
>
> > Core was generated by `postgres: mailinfo_ow mailinfo_ods 10.50.6.6(4188'.
> > Program terminated with signal 11, Segmentation fault.
> >
> > #0 MinimumActiveBackends (min=50) at procarray.c:2472
> > 2472 if (pgxact->xid == InvalidTransactionId)
>
> It's not surprising that you're not able to make this crash
> consistently, because it looks like the problem might be in concurrent
> modifications to the PGXACT array. This routine, MinimumActiveBackends,
> walks the PGPROC array explicitly without locks. There are comments
> indicating that this is safe, but evidently something has slipped in
> there.
>
> Apparently this code is trying to dereference an invalid pgxact, but
> it's not clear to me how this happens. Those structs are allocated in
> advance, and they are referenced in the code via array indexes, so even
> if the pgxact doesn't actually hold data about a valid transaction,
> dereferencing the XID shouldn't cause a crash.
Well, that code is pretty, uh, questionable. E.g. for

    int pgprocno = arrayP->pgprocnos[index];
    volatile PGPROC *proc = &allProcs[pgprocno];
    volatile PGXACT *pgxact = &allPgXact[pgprocno];

there's no guarantee that pgprocno is actually the same index for both
lookups and the following

    if (pgprocno == -1)
        continue; /* do not count deleted entries */

check. It's perfectly reasonable for a compiler to reload pgprocno from
memory, or just always reference it via memory.

I presume what happened here is that initially arrayP->pgprocnos[index]
was -1, but by the time if (pgprocno == -1) is reached, it changed to a
different value.

It's also really crummy that we're doing the PGPROC/PGXACT lookups
before checking whether pgprocno is -1.

At the very least ISTM that we have to make pgprocno volatile (or use a
memory barrier - but we don't have sufficient support for those in the
older branches), and move the PGPROC/PGXACT lookups after the == -1
check.
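
A minimal sketch of what that could look like, outside the real tree. The
struct definitions, the fixed array size, and the function name
count_active_entries are hypothetical stand-ins, not the actual
procarray.c code; only the volatile pgprocno and the reordered lookup
reflect the suggestion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the PostgreSQL types. */
typedef unsigned int TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define MAX_ENTRIES 8

typedef struct
{
    TransactionId xid;
} PGXACT;

typedef struct
{
    int numProcs;
    int pgprocnos[MAX_ENTRIES];
} ProcArrayStruct;

static PGXACT allPgXact[MAX_ENTRIES];

/*
 * Count entries that have a valid xid. The slot value is copied once
 * into a volatile local, so every later use reads that copy instead of
 * going back to the shared array.
 */
static int
count_active_entries(ProcArrayStruct *arrayP)
{
    int count = 0;
    int index;

    assert(arrayP != NULL);

    for (index = 0; index < arrayP->numProcs; index++)
    {
        /* one read of the shared slot; all later uses read this copy */
        volatile int pgprocno = arrayP->pgprocnos[index];
        volatile PGXACT *pgxact;

        if (pgprocno == -1)
            continue;           /* do not count deleted entries */

        /* the lookup happens only after the -1 check */
        pgxact = &allPgXact[pgprocno];

        if (pgxact->xid == InvalidTransactionId)
            continue;           /* no transaction in progress */

        count++;
    }

    return count;
}
```

This only addresses the reload and ordering problems described above; it
does not make the lockless walk safe in any broader sense.
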
Andres