From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | Re: SIGSEGV in BRIN autosummarize |
Date: | 2017-10-15 12:44:58 |
Message-ID: | efefda33-5fd9-0a77-6ae5-ca21dbd163aa@2ndquadrant.com |
Lists: | pgsql-hackers |
Hi,
On 10/15/2017 03:56 AM, Justin Pryzby wrote:
> On Fri, Oct 13, 2017 at 10:57:32PM -0500, Justin Pryzby wrote:
...
>> It's a bit difficult to guess what went wrong from this backtrace. For
>> me gdb typically prints a bunch of lines immediately before the frames,
>> explaining what went wrong - not sure why it's missing here.
>
> Do you mean this ?
>
> ...
> Loaded symbols for /lib64/libnss_files-2.12.so
> Core was generated by `postgres: autovacuum worker process gtt '.
> Program terminated with signal 11, Segmentation fault.
> #0 pfree (pointer=0x298c740) at mcxt.c:954
> 954 (*context->methods->free_p) (context, pointer);
>
Yes. So either 'context' is bogus, or 'context->methods' is bogus, or
'context->methods->free_p' is bogus.
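To make those three possibilities concrete, here is a toy model of the dereference chain on that line. It is a simplified sketch, not the actual mcxt.c code: all the Toy* names are made up for illustration, and the only assumption carried over from PostgreSQL is that the owning context pointer lives in a header just before the chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ToyContext ToyContext;

/* Per-context method table (hypothetical, loosely modelled on the real one). */
typedef struct ToyMethods
{
    void (*free_p) (ToyContext *context, void *pointer);
} ToyMethods;

struct ToyContext
{
    const ToyMethods *methods;
    int         nfreed;         /* counts chunks freed through this context */
};

/* Header stored immediately before each chunk's payload. */
typedef struct ToyChunkHeader
{
    ToyContext *context;
} ToyChunkHeader;

static void
toy_free_impl(ToyContext *context, void *pointer)
{
    /* Step back to the start of the malloc'd block and release it. */
    free((char *) pointer - sizeof(ToyChunkHeader));
    context->nfreed++;
}

static const ToyMethods toy_methods = {toy_free_impl};

/* Allocate header + payload; the payload is only pointer-aligned here. */
static void *
toy_alloc(ToyContext *context, size_t size)
{
    ToyChunkHeader *hdr = malloc(sizeof(ToyChunkHeader) + size);

    if (hdr == NULL)
        return NULL;
    hdr->context = context;
    return (char *) hdr + sizeof(ToyChunkHeader);
}

static void
toy_pfree(void *pointer)
{
    ToyChunkHeader *hdr;
    ToyContext *context;

    assert(pointer != NULL);
    hdr = (ToyChunkHeader *) ((char *) pointer - sizeof(ToyChunkHeader));

    /* (1) garbage if the header was overwritten or the chunk was reused */
    context = hdr->context;

    /* (2) context->methods and (3) methods->free_p can each be bad too */
    context->methods->free_p(context, pointer);
}
```

In this model, anything that scribbles over the bytes just before the payload makes step (1) return garbage, and the crash then happens on the call line, much like in the backtrace.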
>> Perhaps some of those pointers are bogus, the memory was already pfree-d
>> or something like that. You'll have to poke around and try dereferencing
>> the pointers to find what works and what does not.
>>
>> For example what do these gdb commands do in the #0 frame?
>>
>> (gdb) p *(MemoryContext)context
>
> (gdb) p *(MemoryContext)context
> Cannot access memory at address 0x7474617261763a20
>
OK, this means the memory context pointer (tracked in the header of a
chunk) is bogus. There are several common ways this can happen:
* Something corrupts memory (typically an out-of-bounds write).
* The chunk got allocated in an incorrect memory context (which was then
released, and the memory was reused for new stuff).
* It's a use-after-free.
* ... various other possibilities ...
>
> I uploaded the corefile:
> http://telsasoft.com/tmp/coredump-postgres-autovacuum-brin-summarize.gz
>
Thanks, but I'm not sure that'll help at this point. We already know
what happened (corrupted memory), but we don't know "how". And core files
are mostly just "snapshots", so they are not very useful in answering that :-(
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services