From: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
---|---|
To: | Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Aggregate ORDER BY patch |
Date: | 2009-11-15 23:23:34 |
Message-ID: | 871vjzl82a.fsf@news-spur.riddles.org.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
>>>>> "Andrew" == Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
Andrew> Performance.
Andrew> tuplesort_getdatum etc. seems to be substantially faster than
Andrew> tuplesort_gettupleslot especially for the case where you're
Andrew> sorting a pass-by-value datum such as an integer (since the
Andrew> datum is then stored only in the sort tuple header and
Andrew> doesn't require a separate space allocation for
Andrew> itself). Using a slot in all cases would have slowed down
Andrew> some common cases like count(distinct id) by a measurable
Andrew> amount.
Andrew> Cases like array_agg(x order by x) benefit from the faster
Andrew> code path too.
Andrew> The memory management between the two cases is sufficiently
Andrew> different that combining them into one function while still
Andrew> maintaining the slot vs. datum distinction would be ugly and
Andrew> probably error-prone. The relatively minor duplication of
Andrew> logic seemed much clearer to me.
Just to quantify this, using a production-quality build (optimized and
without assertions), it turns out that the fast code path
(process_ordered_aggregate_single) is faster by 300% for pass-by-value
types, and by approximately 20% for short values of pass-by-reference
types, as compared to disabling that code path and forcing even the
one-arg case to use the slot interface.
So using the slot interface for everything would have constituted a
300% slowdown over the older code for count(distinct id), obviously
undesirable.
As it stands, I can't detect any performance regression over the
previous code.
This means that agg(x order by y) is rather noticably slower than
agg(x order by x), but this is pretty much unavoidable given how the
sorting code works.
Future performance enhancements (which I have no particular plans to
tackle) would involve having the planner consult the desired aggregate
orderings and estimating the cost of sorting as opposed to the cost of
producing a plan with the input already ordered. Also combining the
sort step for aggregates that share a single ordering.
--
Andrew (irc:RhodiumToad)
From | Date | Subject | |
---|---|---|---|
Next Message | David E. Wheeler | 2009-11-15 23:34:31 | Re: Summary and Plan for Hot Standby |
Previous Message | Robert Haas | 2009-11-15 23:22:44 | Re: Summary and Plan for Hot Standby |