From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE sampling is too good
Date: 2013-12-12 18:29:40
Message-ID: 24713.1386872980@sss.pgh.pa.us
Lists: pgsql-hackers
Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> It would be relatively easy to fix this if we trusted the number of visible
> rows in each block to be fairly constant. But without that assumption, I
> don't see a way to fix the sample selection process without reading the
> entire table.
Yeah, varying tuple density is the weak spot in every algorithm we've
looked at. The current code is better than what was there before, but as
you say, not perfect. You might be entertained to look at the threads
referenced by the patch that created the current sampling method:
http://www.postgresql.org/message-id/1tkva0h547jhomsasujt2qs7gcgg0gtvrp@email.aon.at
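To make the density problem concrete, here's a quick Python simulation
(nothing from the tree; the table shape is invented) of the naive
two-stage scheme: pick a block uniformly, then a row within it. Rows
sitting in sparse blocks get sampled far more often than rows in dense
blocks, which is exactly the bias we keep running into:

    import random

    random.seed(0)

    # Invented table shape: 500 "dense" blocks of 100 rows each and
    # 500 "sparse" blocks of 10 rows each.
    blocks = [100] * 500 + [10] * 500

    DRAWS = 100_000
    dense_hits = sparse_hits = 0
    for _ in range(DRAWS):
        b = random.randrange(len(blocks))  # stage 1: pick a block uniformly
        _ = random.randrange(blocks[b])    # stage 2: pick a row within it
        if blocks[b] == 100:
            dense_hits += 1
        else:
            sparse_hits += 1

    # Average number of times each individual row got sampled:
    print("hits per dense-block row:  %.2f" % (dense_hits / (500 * 100.0)))
    print("hits per sparse-block row: %.2f" % (sparse_hits / (500 * 10.0)))
    # A uniform row sample would print equal numbers; here the
    # sparse-block rows come out roughly 10x over-represented.

The current code works around this rather than solving it, which is why
varying density still shows up as noise in the stats.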
However ... where this thread started was not about trying to reduce
the remaining statistical imperfections in our existing sampling method.
It was about whether we could reduce the number of pages read for an
acceptable cost in increased statistical imperfection.
			regards, tom lane