Re: [sqlsmith] Parallel worker executor crash on master

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andreas Seltenreich <seltenreich(at)gmx(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [sqlsmith] Parallel worker executor crash on master
Date: 2017-12-16 20:30:21
Message-ID: CAEepm=0RfCKGwwfO0LK_w0hF_+HVA84M_UCiMk8uOO0i6nW9SQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Dec 16, 2017 at 10:13 PM, Andreas Seltenreich
<seltenreich(at)gmx(dot)de> wrote:
> Amit Kapila writes:
>
>> This seems to be another symptom of the problem related to
>> es_query_dsa for which Thomas has sent a patch on a different thread
>> [1]. After applying that patch, I am not able to see the problem. I
>> think due to the wrong usage of dsa across nodes, it can lead to
>> sending some wrong values for params to workers.
>>
>> [1] - https://www.postgresql.org/message-id/CAEepm%3D0Mv9BigJPpribGQhnHqVGYo2%2BkmzekGUVJJc9Y_ZVaYA%40mail.gmail.com
>
> while my posted recipe is indeed inconspicuous with the patch applied,
> It seems to have made matters worse from the sqlsmith perspective:
> Instead of one core dump per hour I get one per minute. Sample
> backtrace below. I could not find a recipe yet to reproduce these
> (beyond starting sqlsmith).
>
> regards,
> Andreas
>
> Core was generated by `postgres: smith regression [local] SELECT '.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  gather_getnext (gatherstate=0x555a5fff1350) at nodeGather.c:283
> 283         estate->es_query_dsa = gatherstate->pei->area;
> #1  ExecGather (pstate=0x555a5fff1350) at nodeGather.c:216

Hmm, thanks. That's not good. Do we know if gatherstate->pei is
NULL, or if it's somehow pointing to garbage? Not sure how either of
those things could happen, since we only set it to NULL in
ExecShutdownGather() after which point we shouldn't call ExecGather()
again, and any MemoryContext problems with pei should have caused
problems already without this patch (for example in
ExecParallelCleanup). Clearly I'm missing something.
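
To make the two possibilities concrete, here is a simplified stand-alone
model of the lifecycle described above. It is only a sketch, not the code
in nodeGather.c: the Model* structs and model_* functions are invented for
illustration, and the NULL guard is an assumption added to show the
distinction, not something the real code or the patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ParallelExecutorInfo, EState and GatherState. */
typedef struct ModelPEI { void *area; } ModelPEI;
typedef struct ModelEState { void *es_query_dsa; } ModelEState;
typedef struct ModelGatherState { ModelPEI *pei; } ModelGatherState;

/* Models the only place described above where pei is reset to NULL. */
static void
model_shutdown_gather(ModelGatherState *gs)
{
    assert(gs != NULL);
    gs->pei = NULL;
}

/*
 * Models the assignment at nodeGather.c:283, with an added guard.
 * Returns 0 on success, -1 if pei is NULL (i.e. called after shutdown).
 * A pei pointing to garbage cannot be detected this way; that case needs
 * a debugger or a memory checker.
 */
static int
model_gather_getnext(ModelGatherState *gs, ModelEState *es)
{
    assert(gs != NULL && es != NULL);
    if (gs->pei == NULL)
        return -1;
    es->es_query_dsa = gs->pei->area;
    return 0;
}
```

In the core file, printing gatherstate->pei in frame 0 should tell the two
cases apart: 0x0 would mean ExecGather() ran after shutdown, anything else
would suggest a dangling pointer.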

--
Thomas Munro
http://www.enterprisedb.com
