Quick Links

Re: SegFault on 9.6.14

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jerry Sievers <gsievers19(at)comcast(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SegFault on 9.6.14
Date:	2019-11-18 08:52:44
Message-ID:	CAA4eK1KkhPSt8wxvoQgFRA1dePs8JvTHdTjq=vyZqEmsAn3c-g@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Oct 18, 2019 at 10:08 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Oct 17, 2019 at 10:51 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> >
> > ===
> > Don't shut down Gather[Merge] early under Limit.
> >
> > Revert part of commit 19df1702f5.
> >
> > Early shutdown was added by that commit so that we could collect
> > statistics from workers, but unfortunately it interacted badly with
> > rescans. Rescanning a Limit over a Gather node could produce a SEGV
> > on 9.6 and resource leak warnings on later releases. By reverting the
> > early shutdown code, we might lose statistics in some cases of Limit
> > over Gather, but that will require further study to fix.
> >
>
> How about some text like below? I have added slightly different text
> to explain the reason for the problem.
>
> "Early shutdown was added by that commit so that we could collect
> statistics from workers, but unfortunately, it interacted badly with
> rescans. The problem is that we ended up destroying the parallel
> context which is required for rescans. This leads to rescans of a
> Limit node over a Gather node to produce unpredictable results as it
> tries to access destroyed parallel context. By reverting the early
> shutdown code, we might lose statistics in some cases of Limit over
> Gather, but that will require further study to fix."
>
> I am not sure but we can even add a comment in the place where we are
> removing some code (in nodeLimit.c) to indicate that 'Ideally we
> should shutdown parallel resources here to get the correct stats, but
> that would lead to rescans misbehaving when there is a Gather [Merge]
> node beneath it. (Explain the reason for misbehavior and the ideas we
> discussed in this thread to fix the same) ........."
>
> I can try to come up with comments in nodeLimit.c on the above lines
> if we think that is a good idea?
>

I have modified the commit message as proposed above and additionally
added comments in nodeLimit.c. I think we should move ahead with this
bug-fix patch. If we don't like the comment, it can anyway be
improved later.

Any suggestions?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment	Content-Type	Size
0001-Don-t-shut-down-Gather-Merge-early-under-Limit.patch	application/octet-stream	5.4 KB

In response to

Re: SegFault on 9.6.14 at 2019-10-18 04:38:46 from Amit Kapila

Responses

Re: SegFault on 9.6.14 at 2019-11-20 11:42:04 from Amit Kapila

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Langote	2019-11-18 08:53:11	Re: adding partitioned tables to publications
Previous Message	Jinbao Chen	2019-11-18 06:48:17	Planner chose a much slower plan in hashjoin, using a large table as the inner table.