Re: On Distributions In 7.2.1

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Mark kirkwood <markir(at)slingshot(dot)co(dot)nz>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: On Distributions In 7.2.1
Date:	2002-05-02 05:00:51
Message-ID:	4886.1020315651@sss.pgh.pa.us
Lists:	pgsql-general

Mark kirkwood <markir(at)slingshot(dot)co(dot)nz> writes:
> There is slightly odd behaviour, with the frequencies decreasing with
> increasing number of quantiles (same as 7.2; same code here?).

That does seem curious. With the inevitable sampling error, you'd
expect that some values would be sampled at a bit more than their
true frequency, and others at a bit less. The oversampled ones would
be the ones to get into the MCV list. But what you've got here is
that even the most-commonly-sampled value showed up at a bit less
than its true frequency. Is this repeatable if you do ANALYZE over
and over? Maybe it was just a statistical fluke.
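The sampling argument above can be checked with a toy simulation. This is only a sketch under stated assumptions: it draws a plain random sample from a perfectly uniform population and ranks values by sampled frequency. It is not the algorithm ANALYZE uses, and the function name and parameters are made up for illustration.

```python
import random
from collections import Counter


def top_sampled_frequencies(n_distinct, sample_size, k, seed):
    """Sample uniformly from n_distinct equally common values and return
    the sampled frequencies of the k most frequently sampled values.

    Hypothetical helper for illustration; not PostgreSQL's ANALYZE logic.
    """
    rng = random.Random(seed)
    sample = [rng.randrange(n_distinct) for _ in range(sample_size)]
    counts = Counter(sample)
    # most_common() orders by sampled count, highest first
    return [count / sample_size for _, count in counts.most_common(k)]


if __name__ == "__main__":
    n_distinct, sample_size, k = 100, 3000, 10
    print("true frequency of every value:", 1 / n_distinct)
    # Repeat with different seeds, analogous to running ANALYZE repeatedly
    for seed in range(5):
        freqs = top_sampled_frequencies(n_distinct, sample_size, k, seed)
        print(seed, [round(f, 4) for f in freqs])
```

In this toy model the top sampled frequency can never fall below the true frequency 1/n_distinct, because the sampled counts sum to the sample size and are spread over at most n_distinct values. Whether the same holds for the numbers ANALYZE stores depends on how it computes them, which this sketch does not model.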

> I am wondering if this is caused by my example not having any "real" most
> common values (they are all as common as each other).
> I am going to fiddle with my data generation script, skew the
> distribution and see what effect that has.

Someone else reported some results that made it look like a logarithmic
frequency distribution was a difficult case for the stats gatherer:
http://archives.postgresql.org/pgsql-general/2002-03/msg01300.php
So please be sure to try that.
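A skewed test column could be generated along these lines. This is a rough sketch: the geometric falloff (each value half as common as the one before) is an assumption about what a strongly skewed case might look like, not the distribution from the linked report, and the function name and parameters are hypothetical.

```python
import random
from collections import Counter


def skewed_values(n_distinct, n_rows, ratio=0.5, seed=0):
    """Return n_rows integers in [0, n_distinct) where value i is drawn
    with weight ratio**i, so each value is rarer than the previous one.

    Hypothetical generator; the geometric weights are an assumption.
    """
    rng = random.Random(seed)
    weights = [ratio ** i for i in range(n_distinct)]
    return rng.choices(range(n_distinct), weights=weights, k=n_rows)


if __name__ == "__main__":
    rows = skewed_values(n_distinct=20, n_rows=10000)
    counts = Counter(rows)
    # Print the observed frequency of each value, most common first
    for value, count in counts.most_common():
        print(value, count / len(rows))
```

The printed frequencies give the "true" distribution to compare against whatever the stats gatherer reports after the rows are loaded.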

regards, tom lane

In response to

On Distributions In 7.2.1 at 2002-05-02 04:17:47 from Mark kirkwood

Responses

Re: On Distributions In 7.2.1 at 2002-05-02 09:36:37 from Mark kirkwood
