Quick Links

Re: crashes due to setting max_parallel_workers=0

From:	David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: crashes due to setting max_parallel_workers=0
Date:	2017-03-27 16:11:14
Message-ID:	CAKJS1f_DJmH44o0ODDoJb=rAUeo5b39u9PTa_xFD6AuqyptBOA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 28 March 2017 at 04:57, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sat, Mar 25, 2017 at 12:18 PM, Rushabh Lathia
> <rushabh(dot)lathia(at)gmail(dot)com> wrote:
>> About the original issue reported by Tomas, I did more debugging and
>> found that - problem was gather_merge_clear_slots() was not returning
>> the clear slot when nreader is zero (means nworkers_launched = 0).
>> Due to the same scan was continue even all the tuple are exhausted,
>> and then end up with server crash at gather_merge_getnext(). In the patch
>> I also added the Assert into gather_merge_getnext(), about the index
>> should be less then the nreaders + 1 (leader).
>
> Well, you and David Rowley seem to disagree on what the fix is here.
> His patches posted upthread do A, and yours do B, and from a quick
> look those things are not just different ways of spelling the same
> underlying fix, but actually directly conflicting ideas about what the
> fix should be. Any chance you can review his patches, and maybe he
> can review yours, and we could try to agree on a consensus position?
> :-)

When comparing Rushabh's to my minimal patch, both change
gather_merge_clear_slots() to clear the leader's slot. My fix
mistakenly changes things so it does ExecInitExtraTupleSlot() on the
leader's slot, but seems that's not required since
gather_merge_readnext() sets the leader's slot to the output of
ExecProcNode(outerPlan). I'd ignore my minimal fix because of that
mistake. Rushabh's patch sidesteps this by adding a conditional
pfree() to not free slot that we didn't allocate in the first place.

I do think the code could be improved a bit. I don't like the way
GatherMergeState's nreaders and nworkers_launched are always the same.
I think this all threw me off a bit and may have been the reason for
the bug in the first place.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: crashes due to setting max_parallel_workers=0 at 2017-03-27 15:57:35 from Robert Haas

Responses

Re: crashes due to setting max_parallel_workers=0 at 2017-03-27 16:22:29 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Kevin Grittner	2017-03-27 16:16:11	Re: Guidelines for GSoC student proposals / Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
Previous Message	Mike Palmiotto	2017-03-27 16:09:08	Re: partitioned tables and contrib/sepgsql