Quick Links

Re: Calculating Median value

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Chandru Aroor <caroor(at)yahoo(dot)com>
Cc:	"pgsql-novice(at)lists(dot)postgresql(dot)org" <pgsql-novice(at)lists(dot)postgresql(dot)org>
Subject:	Re: Calculating Median value
Date:	2018-05-06 18:13:47
Message-ID:	23810.1525630427@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-novice

Chandru Aroor <caroor(at)yahoo(dot)com> writes:
> Thanks Tom. You are actually correct. In talking with the business user, we need to get the 50th percentile, and picking the higher value for even number of members, so median as Excel calculates it is actually not appropriate. Will aggregate percentile_cont, or possibly percentile_disc do it?

If you don't want to interpolate between the two middle values, you must
use percentile_disc not percentile_cont --- that's exactly what the
difference is between those two functions.

I think though that percentile_disc picks the first not the second of the
two middle values. You can fix that by reversing the sort order, so it'd
look like

SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY x DESC) FROM ...

Here's some examples;

regression=# SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY x) FROM (VALUES (1),(3),(7),(9)) v(x);
percentile_disc
-----------------
3
(1 row)

regression=# SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY x DESC) FROM (VALUES (1),(3),(7),(9)) v(x);
percentile_disc
-----------------
7
(1 row)

regression=# SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY x) FROM (VALUES (1),(3),(7),(9)) v(x);
percentile_cont
-----------------
5
(1 row)

regards, tom lane

In response to

Re: Calculating Median value at 2018-05-06 17:06:23 from Chandru Aroor

Browse pgsql-novice by date

	From	Date	Subject
Next Message	Amit S.	2018-05-06 23:04:25	Re: Postgres warm standby with delay
Previous Message	Chandru Aroor	2018-05-06 17:06:23	Re: Calculating Median value