Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-21 13:35:57
Message-ID: 407d949e0912210535l298b68ecu50a94d1a873b86e4@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 21, 2009 at 11:43 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> If we don't find a way to optimize this for on-disk tuplestores though
>> then I wonder whether it's worth it. It would only help in the narrow
>> case that you had a large enough set to see the difference between
>> doing quickselect and quicksort but small enough to fit in workmem. I
>> suppose that case is widening over time though as memory sizes get
>> bigger and bigger.
>
> Thank you
>
> I have to do same test and get more info about quickselect

So my first thought on how to do median efficiently on large on-disk
data sets is to do a first pass that finds as narrow a target
partition as possible that the median definitely falls in. Then do a
second pass counting how many tuples fall before and after the target
partition, accumulating all the tuples within it, and performing
quickselect to find the right tuple inside the target partition.
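
For reference, here's a minimal quickselect sketch in C. It operates
on an in-memory array of doubles rather than tuples, and the function
names and the Lomuto partitioning are just illustrative choices, not
anything from the backend:

#include <stddef.h>

/* Lomuto partition: returns the final index of the pivot (a[hi]). */
static size_t
partition(double *a, size_t lo, size_t hi)
{
	double		pivot = a[hi];
	size_t		i = lo;

	for (size_t j = lo; j < hi; j++)
	{
		if (a[j] < pivot)
		{
			double		tmp = a[i];

			a[i] = a[j];
			a[j] = tmp;
			i++;
		}
	}
	{
		double		tmp = a[i];

		a[i] = a[hi];
		a[hi] = tmp;
	}
	return i;
}

/* Return the k-th smallest of n values (0-based rank). */
double
quickselect(double *a, size_t n, size_t k)
{
	size_t		lo = 0,
				hi = n - 1;

	while (lo < hi)
	{
		size_t		p = partition(a, lo, hi);

		if (k == p)
			return a[k];
		else if (k < p)
			hi = p - 1;
		else
			lo = p + 1;
	}
	return a[lo];
}

Finding the median of n values is then quickselect(a, n, n / 2), with
O(n) expected cost: unlike quicksort, only one side of each partition
is descended into.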

To find a target partition you could use quickselect itself to find a
set of medians, but I think it boils down to a kind of Huffman-like
tree encoding. You accumulate values as degenerate singleton
partitions until they no longer fit in work_mem. When you exhaust
work_mem you collapse two adjacent "partitions" into a single
partition covering their whole range with a count of 2. You keep
going, collapsing the two lowest-count adjacent ranges each time
(perhaps preferring to collapse ranges far from the current median?).
When you're done you can easily determine which range contains the
median. You then have to scan the whole original input again, skipping
any tuple that falls outside that partition. If the result still
doesn't fit in work_mem you have to repeat the algorithm.
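
To make the collapsing step concrete, here's a rough sketch in C of
the bookkeeping for one pass. Everything here is hypothetical:
MAX_PARTS stands in for however many range boundaries fit in work_mem,
values are doubles rather than tuples, and the linear scans would want
a proper data structure in real code.

#include <string.h>

#define MAX_PARTS 8				/* stand-in for "fits in work_mem" */

typedef struct
{
	double		lo;
	double		hi;
	long		count;
} Partition;

static Partition parts[MAX_PARTS + 1];	/* kept ordered by lo */
static int	nparts = 0;

/* Merge the adjacent pair of partitions with the smallest combined count. */
static void
collapse_cheapest_pair(void)
{
	int			best = 0;

	for (int j = 1; j + 1 < nparts; j++)
		if (parts[j].count + parts[j + 1].count <
			parts[best].count + parts[best + 1].count)
			best = j;

	parts[best].hi = parts[best + 1].hi;
	parts[best].count += parts[best + 1].count;
	memmove(&parts[best + 1], &parts[best + 2],
			(nparts - best - 2) * sizeof(Partition));
	nparts--;
}

/* Feed one input value into the partition summary. */
static void
add_value(double v)
{
	/* If v falls inside an existing range, just bump its count. */
	for (int j = 0; j < nparts; j++)
	{
		if (v >= parts[j].lo && v <= parts[j].hi)
		{
			parts[j].count++;
			return;
		}
	}

	/* Otherwise insert a degenerate singleton partition, in order. */
	int			i = nparts;

	while (i > 0 && parts[i - 1].lo > v)
	{
		parts[i] = parts[i - 1];
		i--;
	}
	parts[i] = (Partition) {v, v, 1};
	nparts++;

	if (nparts > MAX_PARTS)
		collapse_cheapest_pair();
}

/* After one pass, report which range must contain the k-th value. */
static Partition
kth_partition(long k)
{
	long		seen = 0;

	for (int j = 0; j < nparts; j++)
	{
		seen += parts[j].count;
		if (k < seen)
			return parts[j];
	}
	return parts[nparts - 1];
}

After the pass, kth_partition(total / 2) gives the range to filter on
during the rescan. This sketch simply merges the adjacent pair with
the smallest combined count; it doesn't attempt the refinement of
preferring ranges far from the current median.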

This looks like a lot of code for a narrow use case though. And it
would do a minimum of two passes through the input, which means it'll
perform similarly to our regular tape sort until the inputs get large
enough that tape sort itself needs multiple passes. At that point it's
possible this algorithm would need multiple passes as well, though
hopefully over smaller and smaller sets. I thought I would put it down
in an email in case anyone's interested in experimenting, though.

Also, I assume I've just reinvented some standard algorithm I should
remember from school, though I can't tell which offhand. It looks kind
of like mergesort, except that since we're only doing selection we
don't need to actually perform the merge step, just keep track of
which values we would have merged.

--
greg
