Re: ANALYZE sampling is too good

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ANALYZE sampling is too good
Date:	2013-12-12 22:25:22
Message-ID:	D1C450BF-DFBF-4330-8C7B-5F8A69BE49D1@phlo.org
Lists:	pgsql-hackers

On Dec 12, 2013, at 19:29, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> However ... where this thread started was not about trying to reduce
> the remaining statistical imperfections in our existing sampling method.
> It was about whether we could reduce the number of pages read for an
> acceptable cost in increased statistical imperfection.

True, but Jeff's case shows that even the imperfections of the current
sampling method are larger than what the n_distinct estimator expects.

Making it even more biased will thus require us to rethink how we
obtain an n_distinct estimate or something equivalent. I don't mean that
as an argument against changing the sampling method, just as something
to watch out for.
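
[Editorial illustration, not part of the original message.] The concern above can be made concrete with a small sketch. It assumes the n_distinct estimator in question is the Haas-Stokes "Duj1" formula, n*d / (n - f1 + f1*n/N), where n is the sample size, N the table's row count, d the number of distinct values in the sample and f1 the number of values seen exactly once. The table layout, page size and helper names below are hypothetical and chosen only to exaggerate the effect; this is not how ANALYZE actually draws its sample.

```python
import random
from collections import Counter


def duj1_estimate(sample, total_rows):
    """Haas-Stokes Duj1 estimate: n*d / (n - f1 + f1*n/N)."""
    n = len(sample)
    counts = Counter(sample)
    d = len(counts)                                  # distinct values in the sample
    f1 = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
    return n * d / (n - f1 + f1 * n / total_rows)


def make_clustered_table(distinct=10_000, copies=10):
    # Hypothetical table: equal values are stored physically next to each other.
    return [v for v in range(distinct) for _ in range(copies)]


def row_sample(table, n, rng):
    # Idealized sample: every row is equally likely to be picked.
    return rng.sample(table, n)


def page_sample(table, n, rows_per_page, rng):
    # Deliberately biased sample: pick whole pages and keep all their rows.
    pages = len(table) // rows_per_page
    chosen = rng.sample(range(pages), n // rows_per_page)
    return [row
            for p in chosen
            for row in table[p * rows_per_page:(p + 1) * rows_per_page]]


rng = random.Random(0)
table = make_clustered_table()
uniform_est = duj1_estimate(row_sample(table, 3000, rng), len(table))
paged_est = duj1_estimate(page_sample(table, 3000, 100, rng), len(table))
print(f"row-level sample:  {uniform_est:.0f}")
print(f"page-level sample: {paged_est:.0f}")
```

With whole pages taken from a clustered table, no value shows up exactly once in the sample, so f1 is 0 and the formula collapses to d, the number of distinct values that happened to be sampled. The estimator has no way to notice that the sample was not drawn uniformly, which is why a more biased sampling method would need a different way of estimating n_distinct.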

best regards,
Florian Pflug

In response to

Re: ANALYZE sampling is too good at 2013-12-12 18:29:40 from Tom Lane

Responses

Re: ANALYZE sampling is too good at 2013-12-16 22:06:08 from Jeff Janes