From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Hadi Moshayedi <hadi(at)moshayedi(dot)net> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, mark(dot)kirkwood(at)catalyst(dot)net(dot)nz |
Subject: | Re: Improving avg performance for numeric |
Date: | 2013-03-18 15:25:37 |
Message-ID: | CAFj8pRCNkmt_92BxbMEDV8W+83k4yeCJuAtM4E8vCdv3tUC9oA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello
I played with sum(numeric) optimization
Now it is based on generic numeric_add function - this code is
relative old - now we can design aggregates with internal transition
buffers, and probably we can do this work more effective.
I just removed useles palloc/free operations and I got a 30% better
performance! My patch is ugly - because I used a generic add_var
function. Because Sum, Avg and almost all aggregates functions is
limited by speed of sum calculation I thing so we need a new numeric
routines optimized for calculation "sum", that use a only preallocated
buffers. A speed of numeric is more important now, because there are
more and more warehouses, where CPU is botleneck.
Regards
Pavel
2013/3/18 Hadi Moshayedi <hadi(at)moshayedi(dot)net>:
> Hi Pavel,
>
> Thanks a lot for your feedback.
>
> I'll work more on this patch this week, and will send a more complete patch
> later this week.
>
> I'll also try to see how much is the speed up of this method for other
> types.
>
> Thanks,
> -- Hadi
>
>
> On Mon, Mar 18, 2013 at 10:36 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>>
>> 2013/3/16 Hadi Moshayedi <hadi(at)moshayedi(dot)net>:
>> > Revisiting:
>> > http://www.postgresql.org/message-id/45661BE7.4050205@paradise.net.nz
>> >
>> > I think the reasons which the numeric average was slow were:
>> > (1) Using Numeric for count, which is slower than int8 to increment,
>> > (2) Constructing/deconstructing arrays at each transition step.
>> >
>> > This is also discussed at:
>> >
>> > http://www.citusdata.com/blog/53-postgres-performance-to-avg-or-to-sum-divided-by-count
>> >
>> > So, I think we can improve the speed of numeric average by keeping the
>> > transition state as an struct in the aggregate context, and just passing
>> > the
>> > pointer to that struct from/to the aggregate transition function.
>> >
>> > The attached patch uses this method.
>> >
>> > I tested it using the data generated using:
>> > CREATE TABLE avg_test AS SELECT (random() * 999)::decimal(5,2) as d FROM
>> > generate_series(1, 10000000) s;
>> >
>> > After applying this patch, run time of "SELECT avg(d) FROM avg_test;"
>> > improves from 10.701 seconds to 5.204 seconds, which seems to be a huge
>> > improvement.
>> >
>> > I think we may also be able to use a similar method to improve
>> > performance
>> > of some other numeric aggregates (like stddev). But I want to know your
>> > feedback first.
>> >
>> > Is this worth working on?
>>
>> I checked this patch and it has a interesting speedup - and a price of
>> this methoud should not be limited to numeric type only
>>
>> Pavel
>>
>> >
>> > Thanks,
>> > -- Hadi
>> >
>> >
>> >
>> > --
>> > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> > To make changes to your subscription:
>> > http://www.postgresql.org/mailpref/pgsql-hackers
>> >
>
>
Attachment | Content-Type | Size |
---|---|---|
numeric-sum-optimize.patch | application/octet-stream | 13.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Steve Singer | 2013-03-18 16:08:09 | pg_upgrade segfaults when given an invalid PGSERVICE value |
Previous Message | Dimitri Fontaine | 2013-03-18 15:05:14 | Re: in-catalog Extension Scripts and Control parameters (templates?) |