From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org> |
Subject: | Re: wip: functions median and percentile |
Date: | 2010-09-21 20:28:15 |
Message-ID: | AANLkTi==JtfXUMBmhrmHyKe-25ieD5etJfeDpWAL8vnk@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-rrreviewers |
Hello
I found probably hard problem in cooperation with window functions :(
tuplesort_begin_datum creates child context inside aggcontext. This
context is used for tuplestore. But when this function is called from
WindowAgg_Aggregates context - someone drops all child context every
iteration, and then tuplestore state is invalid. For this moment we
can block using a median function as window function, but it should be
solved better - if it is possible - I don't see inside window function
implementation.
2010/9/20 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
> 2010/8/19 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>> Hello
>>
>> I am sending a prototype implementation of functions median and
>> percentile. This implementation is very simple and I moved it to
>> contrib for this moment - it is more easy maintainable. Later I'll
>> move it to core.
>
> I've reviewed this patch.
>
> * The patch can apply cleanly and make doesn't print any errors nor
> warnings. But it doesn't touch contrib/Makefile so I had to make by
> changing dir to contrib/median.
>
fixed
> * Cosmetic coding style should be more appropriate, including trailing
> white spaces and indentation positions.
>
can you specify where, please?
> * Since these two aggregates use tuplesort inside it, there're
> possible risk to cause out of memory in normal use case. I don't think
> this fact is critical, but at least some notation should be referred
> in docs.
>
it is same as windows function, no? So the risk is same.
> * It doesn't contain any document nor regression tests.
>
yes, it is my fault, some median regress test appended, documentation
I am will sending later - resp. if you agree with current form of
patch, I'll finalize this patch and remove "median" to core. I put
these functions to contrib just for simple testing of proof concept.
> * They should be callable in window function context; for example
>
> contrib_regression=# select median(a) over (order by a) from t1;
> ERROR: invalid tuplesort state
>
> or at least user-friend message should be printed.
>
the problem is in windows function implementation - partially fixed -
blocked from windows context
> * The returned type is controversy. median(int) returns float8 as the
> patch intended, but avg(int) returns numeric. AFAIK only avg(float8)
> returns float8.
>
fixed
> * percentile() is more problematic; first, the second argument for the
> aggregate takes N of "N%ile" as int, like 50 if you want 50%ile, but
> it doesn't care about negative values or more than 100. In addition,
> the second argument is taken at the first non-NULL value of the first
> argument, but the second argument is semantically constant. For
> example, for (1.. 10) value of a in table t1,
>
yes, I understand - I don't have a problem to modify it. Just this has
same behave as string_agg. This is again deeper problem - we cannot
ensure so some parameter of aggregation function will be a constant. -
so it cannot be well solved now :( Have you some idea? There isn't any
tool for checking if some parameter is constant or not.
I removed percentile now.
> contrib_regression=# select percentile(a, a * 10 order by a) from t1;
> percentile
> ------------
> 1
> (1 row)
>
> contrib_regression=# select percentile(a, a * 10 order by a desc) from t1;
> percentile
> ------------
> 10
> (1 row)
>
> and if the argument comes from the subquery which doesn't contain
> ORDER BY clause, you cannot predict the result.
>
> That's all of my quick review up to now.
>
> Regards,
>
> --
> Hitoshi Harada
>
Attachment | Content-Type | Size |
---|---|---|
median.diff | text/x-patch | 12.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2010-09-21 20:44:16 | Re: wip: functions median and percentile |
Previous Message | Tom Lane | 2010-09-21 20:23:45 | Re: .gitignore files, take two |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2010-09-21 20:44:16 | Re: wip: functions median and percentile |
Previous Message | Pavel Stehule | 2010-09-20 06:22:29 | Re: [HACKERS] wip: functions median and percentile |