Re: how to identify outliers

From:	"Rhys A(dot)D(dot) Stewart" <rhys(dot)stewart(at)gmail(dot)com>
To:	Ben Chobot <bench(at)silentmedia(dot)com>, pgsql-general(at)postgresql(dot)org
Subject:	Re: how to identify outliers
Date:	2009-10-27 22:37:10
Message-ID:	189966030910271537r48499d04s280fe0311b5b838c@mail.gmail.com
Lists:	pgsql-general

I'm asking how to get the ones that don't fall near the avg. For
example, let's say I have the following distances:
10,11,12,11,10,9,9,10,11,12,10,11,99

Then 99 would be an outlier. The avg would be something like 17, I reckon,
with the 99 included. So I want a way to find an outlier, remove it, and then
recalculate the avg, and then I'd get a 'better' avg.

I did some searching about outliers and most of the hits are about R or
SPSS or some other statistical package, so I'm looking for a way to do
it wholly in pgsql.

Rhys
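
For illustration, one way this could be approached wholly in PostgreSQL is sketched below. It is not from the thread. It assumes the data(pnum, distance, route) table quoted further down, assumes the stats are wanted per route, and uses an assumed cutoff of 2 standard deviations from the route's mean; both the grouping and the cutoff are choices to adjust.

```sql
-- Sketch only: the 2.0 cutoff and the per-route grouping are assumptions.

-- 1) List the rows whose distance lies more than 2 standard deviations
--    from the mean distance of their route.
SELECT d.pnum, d.route, d.distance
FROM data d
JOIN (
    SELECT route,
           avg(distance)         AS mean_dist,
           stddev_samp(distance) AS sd_dist   -- NULL when a route has one row
    FROM data
    GROUP BY route
) s ON s.route = d.route
WHERE abs(d.distance - s.mean_dist) > 2.0 * s.sd_dist;

-- 2) Recalculate the average per route with those rows left out.
SELECT d.route, avg(d.distance) AS trimmed_avg
FROM data d
JOIN (
    SELECT route,
           avg(distance)         AS mean_dist,
           stddev_samp(distance) AS sd_dist
    FROM data
    GROUP BY route
) s ON s.route = d.route
WHERE s.sd_dist IS NULL                                   -- single-row routes are kept
   OR abs(d.distance - s.mean_dist) <= 2.0 * s.sd_dist
GROUP BY d.route;
```

With the thirteen example distances above on a single route, the mean is about 17.3 and the sample standard deviation about 24.6, so only 99 lies beyond the cutoff and the recalculated avg is 10.5. Note that a single large outlier inflates the standard deviation itself, so a tighter cutoff or a median-based rule may suit some data better.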

On Tue, Oct 27, 2009 at 4:53 PM, Ben Chobot <bench(at)silentmedia(dot)com> wrote:
> Are you asking how to find the average and standard deviation? Or how to
> compare your data against some set values? Perhaps an example would be
> appropriate; it's not very clear to me what you're asking.
>
> Rhys A.D. Stewart wrote:
>>
>> Hey all,
>> I have the following table: data(pnum text, distance float8, route text).
>> I would like to remove the outliers in distance. Let's say I get
>> the avg distance of pnum for each route and the std deviation of the
>> distance: what is the best way to identify the outliers?
>>
>>
>> Rhys.
>>
>>
>

In response to

how to identify outliers at 2009-10-27 21:36:12 from Rhys A.D. Stewart

Responses

Re: how to identify outliers at 2009-10-27 22:56:04 from Alvaro Herrera
Re: how to identify outliers at 2009-10-27 23:04:47 from Scott Bailey
