From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Jim Nasby <jim(at)nasby(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: ANALYZE sampling is too good |
Date: | 2013-12-11 00:48:37 |
Message-ID: | CAM3SWZRB8dH5RUzpi_fPD3yakFEivoHo1v69Yrv8yDoU+S7DxQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Dec 10, 2013 at 4:14 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> err, so what does stats target mean exactly in statistical theory?
Why would I even mention that to a statistician? We want guidance. But
yes, I bet I could give a statistician an explanation of statistics
target that they'd understand without too much trouble.
> Waiting for a statistician, and confirming his credentials before you
> believe him above others here, seems like wasted time.
>
> What your statistician will tell you is that YMMV, depending on the data.
I'm reasonably confident that they'd give me more than that.
> So we'll still need a parameter to fine tune things when the default
> is off. We can argue about the default later, at various levels of
> rigour.
>
> Block sampling, with parameter to specify sample size. +1
Again, it isn't as if the likely efficacy of *some* block sampling
approach is in question. I'm sure analyze.c is currently naive about
many things. Everyone knows that there are big gains to be had.
Balancing those gains against the possible downsides in terms of
impact on the quality of statistics generated is pretty nuanced. I do
know enough to know that a lot of thought goes into mitigating and/or
detecting the downsides of block-based sampling.
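
To illustrate the kind of downside at issue, here is a toy sketch, not
taken from analyze.c and not any proposed patch. It assumes a
hypothetical table whose values are physically clustered, so that each
distinct value fills several consecutive blocks. The names build_table,
row_sample and block_sample, and all of the sizes, are made up for the
example. Both samplers read the same number of rows, but the block
sampler can only ever see as many distinct values as the blocks it
happened to pick.

```python
import random


def build_table(n_blocks=1000, rows_per_block=100, blocks_per_value=10):
    """Hypothetical table: a list of blocks, each a list of row values.

    Values are physically clustered: every distinct value fills
    `blocks_per_value` consecutive blocks (an assumption for this demo).
    """
    return [[b // blocks_per_value] * rows_per_block for b in range(n_blocks)]


def row_sample(table, n_rows, rng):
    """Pick n_rows rows uniformly at random from the whole table."""
    rows_per_block = len(table[0])
    total_rows = len(table) * rows_per_block
    picks = rng.sample(range(total_rows), n_rows)
    return [table[i // rows_per_block][i % rows_per_block] for i in picks]


def block_sample(table, n_rows, rng):
    """Pick whole blocks at random and keep every row in each of them."""
    rows_per_block = len(table[0])
    n_blocks = n_rows // rows_per_block
    sample = []
    for b in rng.sample(range(len(table)), n_blocks):
        sample.extend(table[b])
    return sample


rng = random.Random(0)  # fixed seed so the run is repeatable
table = build_table()
by_row = row_sample(table, 1000, rng)
by_block = block_sample(table, 1000, rng)

print("distinct values seen, row sampling:  ", len(set(by_row)))
print("distinct values seen, block sampling:", len(set(by_block)))
```

With these made-up sizes the block sample reads only 10 blocks, so it
sees at most 10 of the 100 distinct values, while the row sample reads
the same 1000 rows from all over the table. Real data is rarely this
badly clustered, which is why the question is how to detect and correct
for the effect rather than whether block sampling can work at all.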
--
Peter Geoghegan