| From: | Masaru Sugawara <rk73(at)echna(dot)ne(dot)jp> | 
|---|---|
| To: | pgsql-sql(at)postgresql(dot)org | 
| Subject: | Re: MEDIAN as custom aggregate? | 
| Date: | 2001-10-13 15:19:56 | 
| Message-ID: | 20011014001101.2BE2.RK73@echna.ne.jp | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-sql | 
On Fri, 12 Oct 2001 12:38:12 -0700
"Josh Berkus" wrote:
> For those whose stats terminology is rusty, the "median" is the "middle"
> value in a distribution.  For example, if we had the following data:
> 
> Table ages
> person		age
> Jim		21
> Rusty		24
> Carol		37
> Bob		62
> Leah		78
> 
> Our Median would be Carol's age, 37.  This is a different figure from
> the Mean, or Average, which is 44.4.  Using the combination of the Mean
> and the Median you can do all kinds of interesting statistical analysis.
> 
In such case of this, there needs "nextval" in a query to deal with
a large number of rows. I think the following query, is not smart, 
will return the necessary rows (or an average of the rows). But even
so it may need considerable time...
-- (on 7.1.3)
create sequence seq_ages start 1;
select a1.age, a1.rank   -- or select avg(a1.age)
  from (select a0.person, a0.age, (nextval('seq_ages') - 1) as rank
          from (select *, setval('seq_ages',1)   -- to reset a sequence
                  from ages
                  order by age   -- this insignificant "order by" is
                                 --  needed in order to work "setval"
                ) as a0
           order by a0.age
        ) as a1
  where exists (select * from ages
                   where a1.rank >= (select (count(*)+1)/2 from ages)
                     and a1.rank <= (select  count(*)/2+1  from ages)
                )
;
Regards,
Masaru Sugawara
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Lee Harr | 2001-10-13 21:51:22 | Re: GROUPING | 
| Previous Message | Aasmund Midttun Godal | 2001-10-13 14:57:06 | Re: Indexes |