From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jerry Sievers <gsievers19(at)comcast(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: SegFault on 9.6.14 |
Date: | 2019-11-18 08:52:44 |
Message-ID: | CAA4eK1KkhPSt8wxvoQgFRA1dePs8JvTHdTjq=vyZqEmsAn3c-g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Oct 18, 2019 at 10:08 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Oct 17, 2019 at 10:51 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> >
> > ===
> > Don't shut down Gather[Merge] early under Limit.
> >
> > Revert part of commit 19df1702f5.
> >
> > Early shutdown was added by that commit so that we could collect
> > statistics from workers, but unfortunately it interacted badly with
> > rescans. Rescanning a Limit over a Gather node could produce a SEGV
> > on 9.6 and resource leak warnings on later releases. By reverting the
> > early shutdown code, we might lose statistics in some cases of Limit
> > over Gather, but that will require further study to fix.
> >
>
> How about some text like below? I have added slightly different text
> to explain the reason for the problem.
>
> "Early shutdown was added by that commit so that we could collect
> statistics from workers, but unfortunately, it interacted badly with
> rescans. The problem is that we ended up destroying the parallel
> context which is required for rescans. This leads to rescans of a
> Limit node over a Gather node to produce unpredictable results as it
> tries to access destroyed parallel context. By reverting the early
> shutdown code, we might lose statistics in some cases of Limit over
> Gather, but that will require further study to fix."
>
> I am not sure but we can even add a comment in the place where we are
> removing some code (in nodeLimit.c) to indicate that 'Ideally we
> should shutdown parallel resources here to get the correct stats, but
> that would lead to rescans misbehaving when there is a Gather [Merge]
> node beneath it. (Explain the reason for misbehavior and the ideas we
> discussed in this thread to fix the same) ........."
>
> I can try to come up with comments in nodeLimit.c on the above lines
> if we think that is a good idea?
>
I have modified the commit message as proposed above and additionally
added comments in nodeLimit.c. I think we should move ahead with this
bug-fix patch. If we don't like the comment, it can anyway be
improved later.
Any suggestions?
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
0001-Don-t-shut-down-Gather-Merge-early-under-Limit.patch | application/octet-stream | 5.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2019-11-18 08:53:11 | Re: adding partitioned tables to publications |
Previous Message | Jinbao Chen | 2019-11-18 06:48:17 | Planner chose a much slower plan in hashjoin, using a large table as the inner table. |