| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | nickf(at)ontko(dot)com |
| Cc: | "pgsql-admin" <pgsql-admin(at)postgresql(dot)org> |
| Subject: | Re: Problem with n_distinct being consistently inaccurate. |
| Date: | 2003-09-23 17:37:16 |
| Message-ID: | 20330.1064338636@sss.pgh.pa.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-admin |
"Nick Fankhauser" <nickf(at)ontko(dot)com> writes:
> So the question is- how can I get a better estimate of n_distinct from
> analyze?
> If I alter the stats target as high as it will go, I get closer, but it
> still shows the index to be about 1/2 as selective as it actually is:
AFAIK, estimating number of distinct values from a small sample is
inherently an ill-conditioned problem. You should probably be happy
it can get within a factor of 2 ;-).
You could try sticking the correct n_distinct into pg_statistic by hand
just to see if it really does change the plan, but I'd like to think
that getting within a factor of 2 is good enough. If it's not, then we
probably ought to look for ways to avoid using number-of-distinct-values
statistics altogether, because we'll seldom have a hard value for it.
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | pierre bach | 2003-09-23 17:37:24 | vacuum failed - pgtoast not btree |
| Previous Message | Priya G | 2003-09-23 17:16:48 | Re: help needed!!! |