From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, "Adam, Etienne (Nokia-TECH/Issy Les Moulineaux)" <etienne(dot)adam(at)nokia(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>, "Duquesne, Pierre (Nokia-TECH/Issy Les Moulineaux)" <pierre(dot)duquesne(at)nokia(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] [postgresql 10 beta3] unrecognized node type: 90 |
Date: | 2017-08-30 18:16:14 |
Message-ID: | 18224.1504116974@sss.pgh.pa.us |
Lists: | pgsql-bugs pgsql-hackers |
Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> On Tue, Aug 29, 2017 at 10:05 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> If no objections, I'll do the additional legwork and push.
> No objections.
Done. Out of curiosity, I pushed just the rescan-param patch to the
buildfarm to start with, to see if anything would fall over, and indeed
some things did:
* prairiedog has shown several instances of a parallel bitmap heap scan
test failing with too many rows being retrieved. I think what's
happening there is that the leader's ExecReScanBitmapHeapScan call happens
late enough that the worker(s) have already retrieved some rows
using the old shared state. We'd determined that the equivalent case
for a plain seqscan would result in no failure because the workers would
think they had nothing to do, but this evidently isn't true for a parallel
bitmap scan.
* prairiedog and loach have both shown failures with the test case from
a2b70c89c, in which the *first* scan produces too many rows and then the
later ones are fine. This befuddled me initially, but then I remembered
that nodeNestloop.c will unconditionally do an ExecReScan call on its
inner plan before the first ExecProcNode call. With the modified code
from 7df2c1f8d, this results in the leader's Gather node's top child
having a pending rescan on it due to a chgParam bit. That's serviced
when we do the first ExecProcNode call on the child, after having started
the workers. So that's another way in which a ReScan call can happen
in the leader when workers are already running, and if the workers have
already scanned some pages then those pages will get scanned again.
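
Both failures above come down to the same interleaving: the leader resets
shared scan state after workers have already consumed part of it. A minimal
single-threaded sketch of that ordering follows. It is a toy model, not
PostgreSQL code: ToySharedScan, toy_scan_pages, toy_rescan, toy_run and
NPAGES are all hypothetical names, and real shared scan state is
considerably more involved than a single page counter.

```c
#include <assert.h>
#include <string.h>

#define NPAGES 8   /* hypothetical size of the scanned relation, in pages */

/* Toy stand-in for shared parallel-scan state (an assumption, not PG's). */
typedef struct ToySharedScan
{
    int next_page;            /* next page to hand out to a participant */
    int scan_count[NPAGES];   /* how many times each page has been scanned */
} ToySharedScan;

/* Claim and "scan" up to n pages; returns the number actually scanned. */
static int
toy_scan_pages(ToySharedScan *s, int n)
{
    int done = 0;

    while (done < n && s->next_page < NPAGES)
    {
        s->scan_count[s->next_page]++;
        s->next_page++;
        done++;
    }
    return done;
}

/* The leader's rescan: reset the shared position to the start. */
static void
toy_rescan(ToySharedScan *s)
{
    s->next_page = 0;
}

/*
 * Run one scan in which workers get through head_start pages before the
 * leader's rescan takes effect.  Returns the total number of page scans,
 * standing in for the number of rows the query would return.
 */
static int
toy_run(int head_start)
{
    ToySharedScan s;
    int total;

    assert(head_start >= 0 && head_start <= NPAGES);
    memset(&s, 0, sizeof(s));

    total = toy_scan_pages(&s, head_start);  /* workers run first */
    toy_rescan(&s);                          /* late rescan in the leader */
    total += toy_scan_pages(&s, NPAGES);     /* participants drain the scan */
    return total;
}
```

In this model toy_run(0) returns NPAGES, while toy_run(3) returns NPAGES + 3,
because the three pages handed out before the reset get scanned a second
time. That is the "too many rows" symptom in miniature.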
So I think this is all fixed up by 41b0dd987, but evidently those patches
are not nearly as independent as I first thought.

regards, tom lane