Re: stddev returns 0 when there is one row

From: Murthy Kambhampaty <murthy(dot)kambhampaty(at)goeci(dot)com>
To: 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Manfred Koizar <mkoi-pg(at)aon(dot)at>, Bruno Wolff III <bruno(at)wolff(dot)to>, pgsql-general(at)postgresql(dot)org
Subject: Re: stddev returns 0 when there is one row
Date: 2003-04-21 16:56:49
Message-ID: 2D92FEBFD3BE1346A6C397223A8DD3FC0921FA@THOR.goeci.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

FWIW, it seems that, at least in the long run, the variance(expression) and
stddev(expression) functions, should take two optional arguments, one that
specifies whether to return the sample or population measure, and the other
that identifies the count() column, and the processing would go like this
(pardon my Bash-ism):

variance(X,ARG1,ARG2) () {
if [[ ARG2 != NULL ]]; then
SUMX2 = sum( X**2 *"ARG2");
SUMX = sum( X *"ARG2");
N = sum( "ARG2");
else
SUMX2 = sum( X**2 );
SUMX = sum( X );
N = count( X);
if

if [[ ARG1 = 0 ]]; then # let 0 specify the sample measure
varX = (SUMX2 - SUMX *SUMX/N)/(N -1) #there was a typo in my
earlier msg: missing "/sum( freq)" in the numerator
elif [[ ARG1 = 1 ]]; then
varX = (SUMX2 - SUMX *SUMX/N)/(N)
fi
}

It's easy enough to do client side, for now.
Murthy

PS: When you said, in response to Joe Conway's post re the SQL200x spec,
that you'd make the changes, I read it to mean you were going to implement
both the sample and population measures, but on second reading I think you
were just been saying that you'd make the existing function properly return
the sample standard deviation - NULL if only 1 row given?

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Monday, April 21, 2003 12:21
To: Murthy Kambhampaty
Cc: Joe Conway; Manfred Koizar; Bruno Wolff III;
pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] stddev returns 0 when there is one row

Murthy Kambhampaty <murthy(dot)kambhampaty(at)goeci(dot)com> writes:
> Tom, would it be too much trouble to program these in (as STDEV_POP_G and
> STDEV_SMPL_G or some variant?

I'm not volunteering, no. (For starters, I think you'd need to upgrade
the aggregate code to allow multiple-input aggregates. Which is worth
doing, but I do not expect to have time to think about such things for
7.4.)

regards, tom lane

Browse pgsql-general by date

  From Date Subject
Next Message Neil Conway 2003-04-21 17:06:01 Re: Are we losing momentum?
Previous Message Joe Conway 2003-04-21 16:34:15 Re: Using TEXT columns for binary content