From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com> |
Cc: | Jim Nasby <jim(at)nasby(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: ANALYZE sampling is too good |
Date: | 2013-12-11 00:14:58 |
Message-ID: | CA+U5nMKQ-b=34u37A1yOMOExQ2me+Tif8_-cYHDM3vODOrLDuA@mail.gmail.com |
Lists: | pgsql-hackers |
On 10 December 2013 23:43, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Tue, Dec 10, 2013 at 3:26 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>>> I agree that looking for information on block level sampling
>>> specifically, and its impact on estimation quality is likely to not
>>> turn up very much, and whatever it does turn up will have patent
>>> issues.
>>
>>
>> We have an entire analytics dept. at work that specializes in finding
>> patterns in our data. I might be able to get some time from them to at least
>> provide some guidance here, if the community is interested. They could
>> really only serve in a consulting role though.
>
> I think that Greg had this right several years ago: it would probably
> be very useful to have the input of someone with a strong background
> in statistics. It doesn't seem that important that they already know a
> lot about databases, provided they can understand what our constraints
> are, and what is important to us. It might just be a matter of having
> them point us in the right direction.

Err, so what does stats target mean exactly in statistical theory?
Waiting for a statistician, and confirming his credentials before you
believe him above others here, seems like wasted time.

What your statistician will tell you is that YMMV, depending on the data.

So we'll still need a parameter to fine-tune things when the default
is off. We can argue about the default later, in various levels of
rigour.

Block sampling, with a parameter to specify sample size. +1
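
For illustration only, block sampling with a sample-size parameter could
be sketched roughly as below. This is not PostgreSQL's ANALYZE code: the
function name block_sample, the sample_blocks parameter, and the
list-of-lists stand-in for a table are all hypothetical, chosen only to
show the idea of reading whole blocks chosen at random and letting the
caller tune how many are read.

```python
import random


def block_sample(blocks, sample_blocks, seed=None):
    """Return every row from `sample_blocks` randomly chosen blocks.

    `blocks` is a hypothetical in-memory stand-in for a table: a list of
    blocks, each block being a list of rows. `sample_blocks` is the
    tunable sample size, expressed in blocks rather than rows.
    """
    rng = random.Random(seed)
    # Never ask for more blocks than the table has.
    n = min(sample_blocks, len(blocks))
    # Choose block numbers without replacement.
    chosen = rng.sample(range(len(blocks)), n)
    rows = []
    # Visit the chosen blocks in physical order, taking all rows in each.
    for block_no in sorted(chosen):
        rows.extend(blocks[block_no])
    return rows


# Hypothetical table: 100 blocks of 10 rows each, row value = row number.
table = [[b * 10 + i for i in range(10)] for b in range(100)]
sample = block_sample(table, sample_blocks=5, seed=42)
```

Because every row of a chosen block is kept, rows in the sample are
clustered by block rather than independent draws. How much that matters
depends on how values are laid out in the table.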
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services