From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org> |
Subject: | Re: wip: functions median and percentile |
Date: | 2010-09-23 13:25:32 |
Message-ID: | AANLkTikYGMCb9E0M_X5nv6Y_=buotHUda_B9zazmDB9L@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-rrreviewers |
sorry
little bit fixed patch
Pavel
2010/9/23 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> Hello
>
> I moved a "median" function to core.
>
> + doc part
> + regress test
>
> Regards
>
> Pavel Stehule
>
>
> 2010/9/20 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
>> 2010/8/19 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>>> Hello
>>>
>>> I am sending a prototype implementation of functions median and
>>> percentile. This implementation is very simple and I moved it to
>>> contrib for this moment - it is more easy maintainable. Later I'll
>>> move it to core.
>>
>> I've reviewed this patch.
>>
>> * The patch can apply cleanly and make doesn't print any errors nor
>> warnings. But it doesn't touch contrib/Makefile so I had to make by
>> changing dir to contrib/median.
>>
>> * Cosmetic coding style should be more appropriate, including trailing
>> white spaces and indentation positions.
>>
>> * Since these two aggregates use tuplesort inside it, there're
>> possible risk to cause out of memory in normal use case. I don't think
>> this fact is critical, but at least some notation should be referred
>> in docs.
>>
>> * It doesn't contain any document nor regression tests.
>>
>> * They should be callable in window function context; for example
>>
>> contrib_regression=# select median(a) over (order by a) from t1;
>> ERROR: invalid tuplesort state
>>
>> or at least user-friend message should be printed.
>>
>> * The returned type is controversy. median(int) returns float8 as the
>> patch intended, but avg(int) returns numeric. AFAIK only avg(float8)
>> returns float8.
>>
>> * percentile() is more problematic; first, the second argument for the
>> aggregate takes N of "N%ile" as int, like 50 if you want 50%ile, but
>> it doesn't care about negative values or more than 100. In addition,
>> the second argument is taken at the first non-NULL value of the first
>> argument, but the second argument is semantically constant. For
>> example, for (1.. 10) value of a in table t1,
>>
>> contrib_regression=# select percentile(a, a * 10 order by a) from t1;
>> percentile
>> ------------
>> 1
>> (1 row)
>>
>> contrib_regression=# select percentile(a, a * 10 order by a desc) from t1;
>> percentile
>> ------------
>> 10
>> (1 row)
>>
>> and if the argument comes from the subquery which doesn't contain
>> ORDER BY clause, you cannot predict the result.
>>
>> That's all of my quick review up to now.
>>
>> Regards,
>>
>> --
>> Hitoshi Harada
>>
>
Attachment | Content-Type | Size |
---|---|---|
median.diff | text/x-patch | 16.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2010-09-23 13:29:45 | Re: Path question |
Previous Message | Robert Haas | 2010-09-23 13:23:43 | Re: Latch implementation |
From | Date | Subject | |
---|---|---|---|
Next Message | Hitoshi Harada | 2010-09-23 17:45:36 | Re: wip: functions median and percentile |
Previous Message | Pavel Stehule | 2010-09-23 13:22:39 | Re: wip: functions median and percentile |