Quick Links

Re: Problem with n_distinct being consistently inaccurate.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	nickf(at)ontko(dot)com
Cc:	"pgsql-admin" <pgsql-admin(at)postgresql(dot)org>
Subject:	Re: Problem with n_distinct being consistently inaccurate.
Date:	2003-09-23 17:37:16
Message-ID:	20330.1064338636@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-admin

"Nick Fankhauser" <nickf(at)ontko(dot)com> writes:
> So the question is- how can I get a better estimate of n_distinct from
> analyze?
> If I alter the stats target as high as it will go, I get closer, but it
> still shows the index to be about 1/2 as selective as it actually is:

AFAIK, estimating number of distinct values from a small sample is
inherently an ill-conditioned problem. You should probably be happy
it can get within a factor of 2 ;-).

You could try sticking the correct n_distinct into pg_statistic by hand
just to see if it really does change the plan, but I'd like to think
that getting within a factor of 2 is good enough. If it's not, then we
probably ought to look for ways to avoid using number-of-distinct-values
statistics altogether, because we'll seldom have a hard value for it.

regards, tom lane

In response to

Problem with n_distinct being consistently inaccurate. at 2003-09-23 16:38:47 from Nick Fankhauser

Responses

Re: Problem with n_distinct being consistently inaccurate. at 2003-09-23 20:03:57 from Nick Fankhauser

Browse pgsql-admin by date

	From	Date	Subject
Next Message	pierre bach	2003-09-23 17:37:24	vacuum failed - pgtoast not btree
Previous Message	Priya G	2003-09-23 17:16:48	Re: help needed!!!