From: | Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> |
---|---|
To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org> |
Subject: | Re: wip: functions median and percentile |
Date: | 2010-09-23 17:45:36 |
Message-ID: | AANLkTinBwHgSEzf_iT-jkjZeMwasqDo=DF+nbXQKZ6Km@mail.gmail.com |
Lists: | pgsql-hackers pgsql-rrreviewers |
2010/9/23 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> Hello
>
> 2010/9/22 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
>> 2010/9/22 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>>> Hello
>>>
>>> I found probably hard problem in cooperation with window functions :(
>
> maybe I was confused. I found a other possible problems.
>
> The problem with median function is probably inside a final function
> implementation. Actually we request possibility of repetitive call of
> final function. But final function call tuplesort_end function and
> tuplesort_performsort. These function changes a state of tuplesort.
> The most basic question is "who has to call tuplesort_end function and
> when?
According to the comment in array_userfuncs.c, array_agg_finalfn()
doesn't clean up its internal state at all and says it's the executor's
responsibility to free the memory. That is allowed because
ArrayBuildState is purely in-memory state. On the other hand, a
tuplesort has to be cleaned up by calling tuplesort_end() if it has a
tapeset member (i.e. a file-based sort), in order to close the
physical files.
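To illustrate the in-memory case, here is a minimal sketch in plain C.
It is not PostgreSQL code and not what the patch does: MedianState,
median_accum(), median_final() and median_free() are hypothetical
names made up for this example. The point it tries to show is that a
final function which only sorts in place and releases nothing can be
called repeatedly, with more values added in between, while cleanup is
left to whoever owns the state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an aggregate state held purely in memory. */
typedef struct MedianState
{
    double *vals;   /* accumulated input values */
    size_t  n;      /* number of values stored */
    size_t  cap;    /* allocated capacity */
} MedianState;

static int
cmp_double(const void *a, const void *b)
{
    double x = *(const double *) a;
    double y = *(const double *) b;

    return (x > y) - (x < y);
}

/* Transition step: append one value. Returns 0 on success, -1 on OOM. */
static int
median_accum(MedianState *st, double v)
{
    assert(st != NULL);
    if (st->n == st->cap)
    {
        size_t  newcap = st->cap ? st->cap * 2 : 8;
        double *p = realloc(st->vals, newcap * sizeof(double));

        if (p == NULL)
            return -1;
        st->vals = p;
        st->cap = newcap;
    }
    st->vals[st->n++] = v;
    return 0;
}

/*
 * Final step: sorts in place and computes the median. It frees nothing,
 * so calling it again (even after further median_accum calls) is safe.
 * Returns 0 on success, -1 if no values were accumulated.
 */
static int
median_final(MedianState *st, double *result)
{
    size_t mid;

    assert(st != NULL && result != NULL);
    if (st->n == 0)
        return -1;
    qsort(st->vals, st->n, sizeof(double), cmp_double);
    mid = st->n / 2;
    *result = (st->n % 2) ? st->vals[mid]
                          : (st->vals[mid - 1] + st->vals[mid]) / 2.0;
    return 0;
}

/* Cleanup belongs to the owner of the state, not to the final step. */
static void
median_free(MedianState *st)
{
    free(st->vals);
    st->vals = NULL;
    st->n = st->cap = 0;
}
```

A state that may hold open files cannot be handled this way without
some explicit hook for closing them, which is where the question of
who calls tuplesort_end() comes from.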
So I see two or three ways to go:
1. Call tuplesort_begin_datum with INT_MAX as workMem rather than the
global work_mem, so that it never spills the sort state to files. It
may sound dangerous, but memory exhaustion can happen in array_agg()
as well.
2. Add an argument to tuplesort that tells it not to use files at all.
This has the same effect as #1 but is a more generic approach.
3. Don't use tuplesort in median() but implement its own sort
management. This looks quite redundant and like a maintenance problem.
#2 sounds like the best option, being the most generic and consistent.
The only question is whether the change is worth making just to
implement median(), since tuplesort is a fundamental facility used
system-wide.
Other options?
Regards,
--
Hitoshi Harada