From: | "John D(dot) Burger" <john(at)mitre(dot)org> |
---|---|
To: | PostgreSQL General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Q:Aggregrating Weekly Production Data. How do you do it? |
Date: | 2007-09-18 12:24:55 |
Message-ID: | 1BF89005-FC4E-4E0A-9B1E-E342B0080EC8@mitre.org |
Lists: | pgsql-general |
Ow Mun Heng wrote:
> The results are valid (verified with actual data) but I don't understand
> the logic. All the statistics books I've read give stdev as
> sqrt(sum((x - ave(x))^2) / (n - 1)). The formula is very different, hence the
> confusion.
A formula is not an algorithm. In particular, the naive way of
calculating variance or standard deviation has massive numerical
instability problems - anything involving sums of squares does.
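
To make the instability concrete, here is a minimal sketch in Python. It assumes that "the naive way" means the one-pass sum-of-squares shortcut, (sum(x^2) - (sum(x))^2 / n) / (n - 1), and compares it with the two-pass textbook formula. The function names and the sample data are made up for illustration and do not come from this thread. The data have an exact sample variance of 30 (deviations from the mean are -6, -3, 3, 6), but the values sit on a large offset, which is where the shortcut loses its significant digits.

```python
# Hypothetical helper names and sample data, chosen only for illustration.

def variance_sum_of_squares(xs):
    """One-pass shortcut: (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(xs)
    sum_x = 0.0
    sum_sq = 0.0
    for x in xs:
        sum_x += x
        sum_sq += x * x  # grows to about 4e18 here, beyond exact double precision
    # Subtracting two huge, nearly equal numbers cancels most significant digits.
    return (sum_sq - sum_x * sum_x / n) / (n - 1)


def variance_two_pass(xs):
    """Textbook formula: sum((x - mean)^2) / (n - 1), using two passes."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Floats on purpose: Python ints would be computed exactly and hide the problem.
data = [1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0]

shortcut = variance_sum_of_squares(data)
two_pass = variance_two_pass(data)

print("sum-of-squares shortcut:", shortcut)
print("two-pass formula:       ", two_pass)
```

With IEEE double precision the two-pass result comes out at 30, while the shortcut lands far away from it, because the running sum of squares can no longer represent the small differences that the variance depends on.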
There are a variety of alternative algorithms for stddev/variance; I
presume your other algorithm is similarly trying to avoid these same
issues (but I have not looked closely at it). You can also see
Wikipedia for one of the best known, due to Knuth/Welford:
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
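
For reference, a minimal sketch of that online algorithm in Python, as I understand it from the page above. It is not the code discussed earlier in this thread, and the names welford_stddev, mean and m2 are my own. It keeps a running mean and a running sum of squared deviations from that mean, so it never forms the large sum of raw squares.

```python
import math


def welford_stddev(values):
    """Sample standard deviation in one pass, using Welford's update rule."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean           # deviation from the old mean
        mean += delta / n          # update the running mean
        m2 += delta * (x - mean)   # old deviation times new deviation
    if n < 2:
        raise ValueError("need at least two values for a sample stddev")
    return math.sqrt(m2 / (n - 1))


# Illustrative data only: a large offset with small spread.
data = [1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0]
print(welford_stddev(data))
```

Because it only needs n, mean and m2 as state, the same update fits naturally into a running aggregate that sees one row at a time.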
- John D. Burger
MITRE