From: | Amit Kapila <amit(dot)kapila(at)huawei(dot)com> |
---|---|
To: | 'Peter Geoghegan' <peter(at)2ndquadrant(dot)com>, 'PG Hackers' <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Inlining comparators as a performance optimisation |
Date: | 2011-09-20 13:57:24 |
Message-ID: | 595E409C76394413878F7F572F5DE0CB@china.huawei.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
>Recent discussions on the threads "Double sorting split patch" and
>"CUDA sorting" raised the possibility that there could be significant
>performance optimisation "low-hanging fruit" picked by having the
>executor treat integers and floats as a special case during sorting,
>avoiding going to the trouble of calling a comparator using the
>built-in SQL function machinery
Why only for integers and floats why not for char/varchar?
But I believe this can make code less maintainable as similar things can be
done at other places to avoid SQL function machinery.
>Once the cache has been warmed, explain analyze very consistently
>reports a runtime of 123ms for this query on master/HEAD, which varies
>+/- 1 ms, with a few outliers of maybe +/- 2ms. However, when I apply
>this patch, that goes down to 107ms +/- 1ms at -O0.
Time 123ms which is without your change is with which optimization -O2 or
O0?
****************************************************************************
***********
This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any way (including,
but not limited to, total or partial disclosure, reproduction, or
dissemination) by persons other than the intended recipient's) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!
-----Original Message-----
From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Peter Geoghegan
Sent: Tuesday, September 20, 2011 7:26 AM
To: PG Hackers
Subject: [HACKERS] Inlining comparators as a performance optimisation
Recent discussions on the threads "Double sorting split patch" and
"CUDA sorting" raised the possibility that there could be significant
performance optimisation "low-hanging fruit" picked by having the
executor treat integers and floats as a special case during sorting,
avoiding going to the trouble of calling a comparator using the
built-in SQL function machinery, and taking advantage of inlining of
the comparator, which has been shown to have a considerable
performance advantage (at least compared to a general purpose c stdlib
qsort(), that takes a function pointer as its comparator, much like
tuplesort).
I've hacked together a sloppy POC implementation in a hurry
(basically, some code is shifted around) , which is attached - I felt
that it would be useful to determine if the community feels that this
is a worth-while undertaking in advance of a business trip that I'm
leaving on tomorrow lasting until Friday, during which I will be
mostly unavailable. The patch breaks the Postgres sorting executor
node (at least when it quicksorts) for any type other than int4. I
apologise for how rough the patch is, but the code itself isn't
important right now - the idea is. I anticipate that the value
state->datumType or something similar will be set in a near future
revision, so that tuplesort_performsort will know which type-specific
optimisation it can use for the type, while falling back on the
existing generic qsort_arg + qsort_arg_comparator, and sorting won't
actually be broken.
I've been doing some preliminary testing using the dell store 2 sample
database. I increase work_mem to '50MB', to ensure that a quicksort
will be performed for sorting (otherwise, I'm using the
postgresql.conf that initdb gave me). The query is:
explain analyze select * from orderlines order by prod_id;
Once the cache has been warmed, explain analyze very consistently
reports a runtime of 123ms for this query on master/HEAD, which varies
+/- 1 ms, with a few outliers of maybe +/- 2ms. However, when I apply
this patch, that goes down to 107ms +/- 1ms at -O0. I think that
that's a pretty good start. Funnily enough, the difference/advantage
vanishes at -O2 (I'm guessing that the higher optimisation level of
GCC 4.5 hyper-corrects away the inlining, but I don't have time to
check that right now).
I imagine the version that I actually submit for patch review will
have a macro-based infrastructure for inlining the sorting of various
built-in types, initially integers and floats. It will most likely
have some other optimisations - I haven't even used a profiler yet.
This performance patch differs from most in that it's difficult in
principle to imagine a performance regression occurring.
Thoughts?
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2011-09-20 14:12:03 | Re: Is there really no interest in SQL Standard? |
Previous Message | Peter Eisentraut | 2011-09-20 13:51:51 | Re: Is there really no interest in SQL Standard? |