From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Subject: | Re: [HACKERS] why not parallel seq scan for slow functions |
Date: | 2018-03-28 21:01:20 |
Message-ID: | CA+TgmobwhrBcYNfo2y6sddUS4vgkaUs4M0vkdXxLTbi8+f-fmQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Mar 28, 2018 at 3:06 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Good idea, such an optimization will ensure that the cases reported
> above will not have regression. However isn't it somewhat beating the
> idea you are using in the patch which is to avoid modifying the path
> in-place?
Sure, but you can't have everything. I don't think modifying the
sortgroupref data in place is really quite the same thing as changing
the pathtarget in place; the sortgroupref stuff is some extra
information about the targets being computed, not really a change in
targets per se. But in any case if you want to eliminate extra work
then we've gotta eliminate it...
> In any case, I think one will still see regression in cases
> where this optimization doesn't apply. For example,
>
> DO $$
> DECLARE count integer;
> BEGIN
> For count In 1..1000000 Loop
> Execute 'explain Select sum(thousand)from tenk1 group by ten';
> END LOOP;
> END;
> $$;
>
> The above block takes 43700.0289 ms on Head and 45025.3779 ms with the
> patch which is approximately 3% regression.
Thanks for the analysis -- the observation that this seemed to affect
cases where CP_LABEL_TLIST gets passed to create_projection_plan
allowed me to recognize that I was doing an unnecessary copyObject()
call. Removing that seems to have reduced this regression below 1% in
my testing.
Also, keep in mind that we're talking about extremely small amounts of
time here. On a trivial query that you're not even executing, you're
seeing a difference of (45025.3779 - 43700.0289)/1000000 = 0.00132 ms
per execution. Sure, it's still 3%, but it's 3% of the time in an
artificial case where you don't actually run the query. In real life,
nobody runs EXPLAIN in a tight loop a million times without ever
running the query, because that's not a useful thing to do. The
overhead on a realistic test case will be smaller. Furthermore, at
least in my testing, there are now cases where this is faster than
master. Now, I welcome further ideas for optimization, but a patch
that makes some cases slightly slower while making others slightly
faster, and also improving the quality of plans in some cases, is not
to my mind an unreasonable thing.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
0003-Rewrite-the-code-that-applies-scan-join-targets-to-p.patch | application/octet-stream | 62.8 KB |
0002-Postpone-generate_gather_paths-for-topmost-scan-join.patch | application/octet-stream | 6.2 KB |
0001-Teach-create_projection_plan-to-omit-projection-wher.patch | application/octet-stream | 7.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2018-03-28 21:09:07 | Re: reorganizing partitioning code |
Previous Message | David G. Johnston | 2018-03-28 20:32:46 | Re: csv format for psql |