From: | Sam Mason <sam(at)samason(dot)me(dot)uk> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Review: Revise parallel pg_restore's scheduling heuristic |
Date: | 2009-08-07 15:33:07 |
Message-ID: | 20090807153307.GI5407@samason.me.uk |
Lists: | pgsql-hackers |
On Fri, Aug 07, 2009 at 10:19:20AM -0500, Kevin Grittner wrote:
> Sam Mason <sam(at)samason(dot)me(dot)uk> wrote:
>
> > What do people do when testing this? I think I'd look to something
> > like Student's t-test to check for statistical significance. My
> > working would go something like:
> >
> > I assume the variance is the same because it's being tested on the
> > same machine.
> >
> > samples = 20
> > stddev = 144.26
> > avg1 = 4783.13
> > avg2 = 4758.46
> > t = 0.54 ((avg1 - avg2) / (stddev * sqrt(2/samples)))
> >
> > We then have to choose how certain we want to be that they're
> > actually different, 90% is a reasonably easy level to hit (i.e. one
> > part in ten, with 95% being more commonly quoted). For 20 samples
> > we have 19 degrees of freedom--giving us a cut-off[1] of 1.328.
> > 0.54 is obviously well below this allowing us to say that there's no
> > "statistical significance" between the two samples at a 90% level.
>
> Thanks for the link; that looks useful. To confirm that I understand
> what this has established (or get a bit of help putting it in
> perspective), what this says to me, in the least technical jargon I
> can muster, is "With this many samples and this degree of standard
> deviation, the average difference is not large enough to have a 90%
> confidence level that the difference is significant." In fact,
> looking at the chart, it isn't enough to reach a 75% confidence level
> that the difference is significant. Significance here would seem to
> mean that at least the given percentage of the time, picking this many
> samples from an infinite set with an average difference that really
> was this big or bigger would generate a value for t this big or
> bigger.
>
> Am I close?
Yes, all that sounds as though you've got it. Note that running the
test more times will tend to reduce the standard error of the averages
(the standard deviation itself should stay about the same), so it may
well become significant. In this case it's unlikely to affect it much
though.
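
For anyone wanting to redo the arithmetic, here is a minimal sketch in Python using the figures quoted above. The function name `pooled_t` is made up for illustration, and the loop over larger sample counts is purely hypothetical: it assumes the averages and standard deviation would stay exactly as they are, which real runs would not guarantee. One caveat on the cut-off: a pooled test over two sets of 20 runs would normally use 2*20 - 2 = 38 degrees of freedom rather than 19; the cut-off at 38 is slightly lower than 1.328, and 0.54 is still well below it.

```python
import math


def pooled_t(avg1, avg2, stddev, samples):
    """t statistic for two equal-sized samples assumed to share one stddev."""
    # Standard error of the difference between the two averages.
    standard_error = stddev * math.sqrt(2.0 / samples)
    return (avg1 - avg2) / standard_error


# Figures quoted earlier in the thread.
t = pooled_t(4783.13, 4758.46, 144.26, 20)
cutoff_90 = 1.328  # one-tailed 90% cut-off for 19 degrees of freedom, as quoted

print(f"t = {t:.2f}, significant at 90%: {t > cutoff_90}")

# Hypothetical: identical averages and stddev, but more runs per patch.
# The standard error shrinks with sqrt(samples), so t grows.
for n in (20, 40, 80, 160):
    print(f"samples = {n:3d}  t = {pooled_t(4783.13, 4758.46, 144.26, n):.2f}")
```

The loop only shows how the same difference in averages looks with more runs behind it; it says nothing about what further runs would actually measure.
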
> I like to be clear, because it's easy to get confused and take the
> above to mean that there's a 90% confidence that there is no actual
> significant difference in performance based on that sampling. (Given
> Tom's assurance that this version of the patch should have similar
> performance to the last, and the samples from the prior patch went the
> other direction, I'm convinced there is not a significant difference,
> but if I'm going to use the referenced calculations, I want to be
> clear how to interpret the results.)
All we're saying is that we're less than 90% confident that there's
something "significant" going on. All the fiddling with standard
deviations and sample sizes is just the easiest way (that I know of) that
statistics currently gives us of determining this more formally than a
hand-wavy "it looks OK to me". Science tells us that humans are liable
to say things are OK when they're not, as well as vice versa; statistics
gives us a way to work past these limitations in some common and useful
situations.
--
Sam http://samason.me.uk/