
Re: ANALYZE sampling is too good

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Greg Stark <stark(at)mit(dot)edu>
Cc:	Peter Geoghegan <pg(at)heroku(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ANALYZE sampling is too good
Date:	2013-12-11 15:16:50
Message-ID:	CA+U5nMLW0yZ3JyuLc5=gcBj4RtV-BdC4zewmwaPu-tFABXaBqA@mail.gmail.com
Lists:	pgsql-hackers

On 11 December 2013 12:08, Greg Stark <stark(at)mit(dot)edu> wrote:

> So there is something clearly wonky in the histogram stats that's
> affected by the distribution of the sample.

...in the case where the avg width changes in a consistent manner
across the table.

Well spotted.

ISTM we can add a specific cross-check for sample bias of that
nature: calculate the avg width per block, then check whether the avg
width correlates with block number. If we find bias, we can calculate
how many extra blocks to sample, and from where.
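
A minimal sketch of that cross-check, to make the idea concrete. It is Python rather than anything resembling the ANALYZE code, and everything in it is an assumption for illustration: the sample is taken to be a list of (block_number, row_width) pairs, the function names are hypothetical, Pearson correlation stands in for whatever correlation measure would actually be chosen, and the 0.5 threshold is arbitrary. It only detects the bias; working out how many extra blocks to sample, and from where, is not attempted.

```python
from math import sqrt


def avg_width_per_block(sample):
    """Average row width for each sampled block.

    sample: iterable of (block_number, row_width_bytes) pairs
    (a hypothetical representation of the ANALYZE sample).
    """
    totals = {}
    for block, width in sample:
        width_sum, row_count = totals.get(block, (0, 0))
        totals[block] = (width_sum + width, row_count + 1)
    return {b: s / n for b, (s, n) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def width_bias(sample, threshold=0.5):
    """Correlate per-block avg width with block number.

    Returns (r, biased), where biased is True when |r| meets the
    threshold. The threshold value is an arbitrary assumption.
    """
    per_block = avg_width_per_block(sample)
    blocks = sorted(per_block)
    r = pearson(blocks, [per_block[b] for b in blocks])
    return r, abs(r) >= threshold
```

A sample whose rows get steadily wider towards the end of the table gives r close to 1 and is flagged; a sample with the same width in every block gives r = 0 and is not.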

There may be other biases too; we can check for those and respond
accordingly.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: ANALYZE sampling is too good at 2013-12-11 12:08:21 from Greg Stark
