From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Douglas Doole <dougdoole(at)gmail(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCH] Push limit to sort through a subquery |
Date: | 2017-08-25 16:05:22 |
Message-ID: | 3207.1503677122@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I had the same thought. Done in the attached revision of your
> version. I couldn't consistently reproduce the failure you saw -- it
> failed for me only about 1 time in 10 -- but I don't think it's
> failing any more with this version.
Hm. It seemed pretty reliable for me, but we already know that the
parallel machinery is quite timing-sensitive. I think we'd better
incorporate a regression test case into the committed patch so we
can see what the buildfarm thinks. I'll see about adding one.
> I think that the comment at the bottom of ExecSetTupleBound should
> probably be rethought a bit in light of these changes.
Agreed; I'll revisit all the comments, actually.
> The details aside, Konstantin Knizhnik's proposal to teach this code
> about ForeignScans is yet another independent way for this to win, and
> I think that will also be quite a large win.
+1
> It's possible that we should work out some of these things at plan
> time rather than execution time, and it's also possible that a
> separate call to ExecSetTupleBound is too much work. I've mulled over
> the idea of whether we should get rid of this mechanism altogether in
> favor of passing the tuple bound as an additional argument to
> ExecProcNode, because it seems silly to call node-specific code once
> to set a (normally small, possibly 1) bound and then call it again to
> get a (or the) tuple.
No, I think that's fundamentally the wrong idea. The tuple bound
inherently is a piece of information that has to apply across a whole
scan. If you pass it to ExecProcNode then you'll have to get into
API contracts about whether it's allowed to change across calls,
which is a mess, even aside from efficiency concerns (and for
ExecProcNode, I *am* concerned about efficiency).
You could imagine adding it as a parameter to ExecReScan, instead, but
then there are a bunch of issues with how that would interact with the
existing optimizations for skipped/delayed/merged rescan calls (cf.
the existing open issue with parallel rescans). That is a can of worms
I'd just as soon not open.
I'm content to leave it as a separate call. I still think that worrying
about extra C function calls for this purpose is, at best, premature
optimization.
I'll do another pass over the patch and get back to you. I've not
looked at the instrumentation patch yet --- is there a reason to
worry about it before we finish this one?
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2017-08-25 16:16:30 | Re: Proposal: global index |
Previous Message | Robert Haas | 2017-08-25 15:43:51 | Re: [PATCH] Push limit to sort through a subquery |