From: | Greg Stark <gsstark(at)mit(dot)edu> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Ed L(dot)" <pgsql(at)bluepolka(dot)net>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Why won't it index scan? |
Date: | 2006-05-18 05:04:53 |
Message-ID: | 87fyj7q5xm.fsf@stark.xeocode.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> "Ed L." <pgsql(at)bluepolka(dot)net> writes:
> > So, does this sound like we just happened to get repeatedly
> > horribly unrepresentative random samples with stats target at
> > 10? Are we at the mercy of randomness here? Or is there a
> > better preventive procedure we can follow to systematically
> > identify this kind of situation?
>
> I think the real issue is that stats target 10 is too small for large
> tables: the samples are just not large enough to support a decent
> numdistinct estimate, which is the critical stat for cases such as this
> (ie, estimating the number of hits on a value that's not in the
> most-common-values list).
There's been some discussion on -hackers about this area. Sadly the idea of
using samples to calculate numdistinct estimates is fundamentally on pretty
shaky ground.
Whereas a fixed sample size works fine for calculating distribution of values,
in order to generate consistent precision for numdistinct estimates the
samples will have to be a constant fraction of the table -- and unfortunately
a pretty large fraction at that.
So sadly I think "at the mercy of randomness" is pretty accurate. You'll have
to raise the statistics target as the table grows and I expect you'll
eventually run into some downsides of large stats targets.
Some better algorithms were posted, but they would require full table scans
during analyze, not just samples.
--
greg
From | Date | Subject | |
---|---|---|---|
Next Message | Florian Weimer | 2006-05-18 05:25:09 | Re: Announce: GPL Framework centered on Postgres |
Previous Message | Tom Lane | 2006-05-18 02:48:19 | Re: Contributing code |