From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: wip: functions median and percentile |
Date: | 2010-10-11 09:55:16 |
Message-ID: | AANLkTingNyxYVhn_Hi9Wbs4TJvetmiKNUjambgQ2ppLj@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-rrreviewers |
2010/10/11 Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>:
> On 10 October 2010 22:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It was pointed out upthread that while median isn't presently
>> in the standard, Oracle defines it in terms of percentile_cont(0.5)
>> which *is* in the standard. What I read in SQL:2008 is that
>> percentile_cont is defined for all numeric types (returning
>> approximate numeric with implementation-defined precision),
>> and for interval (returning interval), and not for any other
>> input type. So it appears to me that what we ought to support
>> is
>> median(float8) returns float8
>> median(interval) returns interval
>> and nothing else --- we can rely on implicit casting to convert
>> any other numeric input type to float8.
>>
>
> Yeah that would be much simpler.
>
> BTW, why has percentile been removed from this patch? As the more
> general, and SQL standard function, that would seem to be the more
> useful one to include. Upthread it was mentioned that there is already
> an ntile window function, but actually that's a completely different
> thing.
The reason for removing was impossibility to specify so some parameter
must by immutable - in this case p parameter should be immutable
otherwise the result is undefined.
Regards
Pavel Stehule
>
>
>> BTW, as far as the implementation issues go, telling tuplesort that it
>> can use gigabytes of memory no matter what seems quite unacceptable.
>> Put this thing into a hash aggregation and you'll blow out your memory
>> in no time. I don't think it's even a good idea to use work_mem there.
>
> Argh! Yes that sounds like a much more serious problem.
>
> Interestingly I couldn't seem to produce this effect. Every effort I
> make to write a query to test this with median ends up being executed
> using a GroupAggregate, while the equivalent query with avg uses a
> HashAggregate. I don't understand why they are being treated
> differently.
>
>
>> I wonder whether it'd be a good idea to augment AggCheckCallContext()
>> so that there's a way for aggregates to find out how much memory they
>> ought to try to use. In a simple aggregation situation it's probably
>> OK to use work_mem, but in a hash aggregation you'd better use less
>> --- perhaps work_mem divided by the number of groups expected.
>
> Wouldn't that risk not allowing any memory at all the to aggregate in
> some cases? I don't have a better idea mind you, short of somehow not
> allowing hash aggregation for this function.
>
>
>> Also, I believe that the lack-of-cleanup problem for tuplesorts spilling
>> to disk should be fixable by using an exprcontext shutdown callback (see
>> RegisterExprContextCallback).
>
> Ah! I wasn't aware of such a callback. Sounds perfect for the job.
>
> Regards,
> Dean
>
>
>>
>> Comments?
>>
>> regards, tom lane
>>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Pierre C | 2010-10-11 10:09:02 | Re: Slow count(*) again... |
Previous Message | Dean Rasheed | 2010-10-11 09:35:13 | Re: wip: functions median and percentile |
From | Date | Subject | |
---|---|---|---|
Next Message | Dean Rasheed | 2010-10-11 10:33:11 | Re: wip: functions median and percentile |
Previous Message | Dean Rasheed | 2010-10-11 09:35:13 | Re: wip: functions median and percentile |