From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Christoph Berg <myon(at)debian(dot)org> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: gs_group_1 crashing on 13beta2/s390x |
Date: | 2020-07-15 21:45:35 |
Message-ID: | 3176347.1594849535@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Christoph Berg <myon(at)debian(dot)org> writes:
>> On the Debian s390x buildd, the 13beta2 build is crashing:
> I wired gdb into the build process and got this backtrace:
> #0 datumCopy (typByVal=false, typLen=-1, value=0) at ./build/../src/backend/utils/adt/datum.c:142
> vl = 0x0
> res = <optimized out>
> res = <optimized out>
> vl = <optimized out>
> eoh = <optimized out>
> resultsize = <optimized out>
> resultptr = <optimized out>
> realSize = <optimized out>
> resultptr = <optimized out>
> realSize = <optimized out>
> resultptr = <optimized out>
> #1 datumCopy (value=0, typByVal=false, typLen=-1) at ./build/../src/backend/utils/adt/datum.c:131
> res = <optimized out>
> vl = <optimized out>
> eoh = <optimized out>
> resultsize = <optimized out>
> resultptr = <optimized out>
> realSize = <optimized out>
> resultptr = <optimized out>
> #2 0x000002aa04423af8 in finalize_aggregate (aggstate=aggstate(at)entry=0x2aa05775920, peragg=peragg(at)entry=0x2aa056e02f0, resultVal=0x2aa056e0208, resultIsNull=0x2aa056e022a, pergroupstate=<optimized out>, pergroupstate=<optimized out>) at ./build/../src/backend/executor/nodeAgg.c:1128
Hmm. If gdb isn't lying to us, that has to be coming from here:
    /*
     * If result is pass-by-ref, make sure it is in the right context.
     */
    if (!peragg->resulttypeByVal && !*resultIsNull &&
        !MemoryContextContains(CurrentMemoryContext,
                               DatumGetPointer(*resultVal)))
        *resultVal = datumCopy(*resultVal,
                               peragg->resulttypeByVal,
                               peragg->resulttypeLen);
The line numbers in HEAD are a bit different, but that's the only
call of datumCopy() in finalize_aggregate().
It's hardly surprising that datumCopy would segfault when given
a null "value" and told it is pass-by-reference. However, to get to
the datumCopy call, we must have passed the MemoryContextContains
check on that very same pointer value, and that would surely have
segfaulted as well, one would think.
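The ordering argument rests on C's `&&` operator, which evaluates its operands left to right and stops at the first false one. The copy can therefore only run after the containment check has already been called on the same pointer. Below is a toy sketch of that control flow. The names `toy_context_contains`, `toy_copy`, and `toy_finalize` are hypothetical stand-ins, not PostgreSQL functions. They only record that they were called and never touch the pointer, so the sketch says nothing about what the real functions do with a null value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical call log: records which stand-in ran, in order. */
enum call { CALL_CONTAINS = 1, CALL_COPY = 2 };

#define MAX_CALLS 4
static enum call call_log[MAX_CALLS];
static int n_calls = 0;

static void reset_log(void)
{
    n_calls = 0;
}

static void record(enum call c)
{
    assert(n_calls < MAX_CALLS);
    call_log[n_calls++] = c;
}

/*
 * Stand-in for the containment check (assumption: not the real one).
 * It records the call and reports "not in this context"; it never
 * dereferences ptr.
 */
static bool toy_context_contains(const void *ptr)
{
    (void) ptr;
    record(CALL_CONTAINS);
    return false;
}

/* Stand-in for the copy: records the call, returns ptr unchanged. */
static const void *toy_copy(const void *ptr)
{
    record(CALL_COPY);
    return ptr;
}

/* Same shape as the guard quoted from finalize_aggregate(). */
static const void *toy_finalize(bool by_val, bool is_null, const void *result)
{
    if (!by_val && !is_null && !toy_context_contains(result))
        result = toy_copy(result);
    return result;
}
```

Calling `toy_finalize(false, false, NULL)` logs the containment check first and the copy second. With `is_null` or `by_val` set to true, neither stand-in runs, because `&&` short-circuits before reaching the check.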
Given the apparently-can't-happen situation at the call site,
and the fact that we're not seeing similar failures reported
elsewhere (and note that every line shown above is at least
five years old), I'm kind of forced to the conclusion that this
is a compiler bug. Does adjusting the -O level make it go away?
regards, tom lane