Re: ANALYZE sampling is too good

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE sampling is too good
Date: 2013-12-10 20:19:14
Message-ID: 52A77742.9070105@vmware.com
Lists: pgsql-hackers

On 12/10/2013 10:00 PM, Simon Riggs wrote:
> On 10 December 2013 19:54, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> On 12/10/2013 11:49 AM, Peter Geoghegan wrote:
>>> On Tue, Dec 10, 2013 at 11:23 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> I don't think that anyone believes that not doing block sampling is
>>> tenable, fwiw. Clearly some type of block sampling would be preferable
>>> for most or all purposes.
>>
>> As discussed, we need math though. Does anyone have an ACM subscription
>> and time to do a search? Someone must. We can buy one with community
>> funds, but no reason to do so if we don't have to.
>
> We already have that, just use Vitter's algorithm at the block level
> rather than the row level.

And what do you do with the blocks? How many blocks do you choose?
Details, please.

- Heikki
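
For readers following the thread, the suggestion quoted above (reservoir sampling applied to block numbers instead of rows) might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses the simple "Algorithm R" form of reservoir sampling, not Vitter's optimized Algorithm Z, and the names sample_blocks, total_blocks and target_blocks are hypothetical, not PostgreSQL identifiers. It deliberately leaves open the two questions asked in this message: how large target_blocks should be, and what is done with the rows inside each chosen block.

```python
import random


def sample_blocks(total_blocks, target_blocks, rng=None):
    """Return a sorted uniform random sample of block numbers.

    Hypothetical helper: simple reservoir sampling (Algorithm R) over
    block numbers 0 .. total_blocks - 1. Choosing target_blocks is left
    to the caller; the thread does not settle that number.
    """
    rng = rng or random.Random()
    reservoir = []
    for blkno in range(total_blocks):
        if len(reservoir) < target_blocks:
            # Fill the reservoir with the first target_blocks blocks.
            reservoir.append(blkno)
        else:
            # Pick a slot in [0, blkno]; replace only if it falls
            # inside the reservoir, so each block is kept with
            # probability target_blocks / (blkno + 1).
            j = rng.randint(0, blkno)
            if j < target_blocks:
                reservoir[j] = blkno
    # Sorted order lets the chosen blocks be read sequentially.
    return sorted(reservoir)


if __name__ == "__main__":
    # Example: choose 5 of 1000 blocks with a fixed seed.
    print(sample_blocks(1000, 5, random.Random(42)))
```

The sketch only selects which blocks to visit. Whether every row in a chosen block is used, or a second sampling step is applied within each block, is exactly the detail this message asks for.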
