| From: | Josh Berkus <josh(at)agliodbs(dot)com> | 
|---|---|
| To: | Greg Stark <gsstark(at)mit(dot)edu> | 
| Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: default_statistics_target WAS: max_wal_senders must die | 
| Date: | 2010-10-21 01:41:36 | 
| Message-ID: | 4CBF9A50.8040604@agliodbs.com | 
| Lists: | pgsql-hackers | 
> I don't see why the MCVs would need a particularly large sample size
> to calculate accurately. Have you done any tests on the accuracy of
> the MCV list?
Yes, although I don't have them at my fingertips.  In sum, though, you
can't take a 10,000-row sample from a 1-billion-row table and expect to
get a remotely accurate MCV list.
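To put rough numbers on that, here's a quick toy simulation (mine, just
for illustration -- the Zipf-ish distribution, the 1M-row scale-down, and
the top-100 MCV list are all assumptions, not anything we measured).  It
builds an MCV list from a 10,000-row uniform sample and compares it
against the true top 100 of the table, which is enough to see how noisy
the tail of the list and its frequency estimates get:

```python
import random
from collections import Counter

random.seed(42)

# Hypothetical skewed "table": 1M rows over 50,000 distinct values with
# Zipf-ish weights.  (Scaled down from 1b rows so it runs in seconds;
# the shape of the distribution is an assumption for illustration.)
N_ROWS = 1_000_000
N_DISTINCT = 50_000
weights = [1.0 / (rank + 1) for rank in range(N_DISTINCT)]
table = random.choices(range(N_DISTINCT), weights=weights, k=N_ROWS)

true_freqs = Counter(table)
true_mcv = [v for v, _ in true_freqs.most_common(100)]

# ANALYZE-style uniform sample of 10,000 rows, then take its top 100
sample = random.sample(table, 10_000)
sample_freqs = Counter(sample)
sample_mcv = [v for v, _ in sample_freqs.most_common(100)]

overlap = len(set(true_mcv) & set(sample_mcv))
print(f"sample MCVs that are really in the top 100: {overlap}/100")

# How noisy is the frequency estimate for the 100th-most-common value?
v = true_mcv[-1]
true_frac = true_freqs[v] / N_ROWS
est_frac = sample_freqs.get(v, 0) / len(sample)
print(f"value {v}: true fraction {true_frac:.5f}, sampled {est_frac:.5f}")
```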
A while back I did a fair bit of reading in the academic literature on
ndistinct estimation for large tables.  The consensus of many papers was
that it takes a sample of at least 3% of the table (or 5% for block-based
sampling) to estimate ndistinct within a factor of 3 with 95% confidence.
I can't imagine that the MCV list is any easier than that.
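For context, the estimator in play is (IIRC) the Haas & Stokes formula
that analyze.c uses, n*d / (n - f1 + f1*n/N), where d is the number of
distinct values in the sample and f1 the number seen exactly once.  The
sketch below is mine, with a made-up column that is half repeated values
and half near-unique ids, just to show how much the estimate moves as the
sample fraction grows toward the 3-5% range those papers talk about:

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical 1M-row column: half the rows repeat one of 1,000 common
# values, half are effectively unique ids.  Both the shape and the sizes
# are assumptions chosen to make ndistinct hard to estimate.
N_ROWS = 1_000_000
table = [random.randrange(1_000) if i % 2 == 0 else 1_000 + i
         for i in range(N_ROWS)]
true_ndistinct = len(set(table))

def estimate_ndistinct(sample, total_rows):
    """Haas/Stokes-style estimate: n*d / (n - f1 + f1*n/N), with f1 =
    number of values seen exactly once in the sample."""
    counts = Counter(sample)
    n, d = len(sample), len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    return n * d / (n - f1 + f1 * n / total_rows)

for fraction in (0.001, 0.01, 0.03, 0.05):
    sample = random.sample(table, int(N_ROWS * fraction))
    est = estimate_ndistinct(sample, N_ROWS)
    print(f"sample {fraction:>5.1%}: ndistinct estimate {est:>9.0f}"
          f"  (true {true_ndistinct:,})")
```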
> And mostly
> what it tells me is that we need a robust statistical method and the
> data structures it requires for estimating the frequency of a single
> value.
Agreed.
>  Binding the length of the MCV list to the size of the histogram is
> arbitrary but so would any other value and I haven't seen anyone
> propose any rationale for any particular value.
Histogram size != sample size.  The two are tied together in our code,
but that's a bug, not a feature.
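As a data point on how tied together they are right now: ANALYZE samples
300 * statistics_target rows, and the MCV list and histogram are each
capped at statistics_target entries, so the only way to ask for a bigger
sample is to also ask for bigger stats arrays.  Quick arithmetic (mine,
just to put numbers against the 1b-row example above):

```python
# Back-of-the-envelope numbers; the 300x multiplier is the one analyze.c
# uses to size the sample from the statistics target, the rest is just
# arithmetic for the hypothetical 1b-row table discussed above.
default_statistics_target = 100                  # current default
sample_rows = 300 * default_statistics_target    # rows ANALYZE samples
table_rows = 1_000_000_000

print(f"sample rows:          {sample_rows:,}")
print(f"sample fraction:      {sample_rows / table_rows:.4%}")
print(f"a 3% sample would be  {int(0.03 * table_rows):,} rows")
```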
-- 
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com