From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | sthomas(at)optionshouse(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Setting Statistics on Functional Indexes |
Date: | 2012-11-14 21:00:25 |
Message-ID: | 25730.1352926825@sss.pgh.pa.us |
Lists: | pgsql-performance |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Shouldn't there be a separate estimator for scalarlesel? Or should
> the existing estimator be adjusted to handle the two cases
> differently?
Well, it does handle it differently to some extent, in that the operator
itself is invoked when checking the MCV values, so we get the right
answer for those.
The fact that there are no separate estimators for < and <= is something
we inherited from Berkeley, so I can't give the original rationale for
certain, but I think the notion was that the difference is imperceptible
when dealing with a continuous distribution. The question is whether
you think that the "=" case contributes any significant amount to the
probability given that the bound is not one of the MCV values. (If it
is, the MCV check will have accounted for it, so adding anything would
be wrong.) I guess we could add 1/ndistinct or something like that,
but I'm not convinced that will really make the estimates better, mainly
because ndistinct is none too reliable itself.
regards, tom lane
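
To make the trade-off discussed above concrete, here is a minimal Python sketch of the idea. It is not PostgreSQL's actual estimator code. The function name `estimate_scalar_le_selectivity`, its parameters, and the shape of the inputs (an MCV dict mapping value to frequency, plus a precomputed histogram fraction for the non-MCV rows) are all assumptions made for illustration. It shows the two points from the message: the operator is applied directly to the MCV values, so < and <= already differ there, and the optional 1/ndistinct term is only added for <= when the bound is not itself an MCV.

```python
def estimate_scalar_le_selectivity(bound, mcv, hist_frac_below, ndistinct,
                                   inclusive, add_eq_term=False):
    """Illustrative estimate for `col < bound` or `col <= bound`.

    mcv             -- dict {value: frequency} of most-common values (hypothetical shape)
    hist_frac_below -- assumed fraction of the NON-MCV rows lying below the bound,
                       as a histogram lookup would provide; identical for < and <=
    ndistinct       -- estimated number of distinct values (itself unreliable)
    inclusive       -- True for <=, False for <
    add_eq_term     -- apply the "add 1/ndistinct" adjustment floated in the message
    """
    # The operator itself is evaluated against each MCV, so the "=" case
    # is counted exactly whenever the bound is one of the MCVs.
    if inclusive:
        mcv_match = sum(f for v, f in mcv.items() if v <= bound)
    else:
        mcv_match = sum(f for v, f in mcv.items() if v < bound)

    non_mcv_fraction = 1.0 - sum(mcv.values())
    sel = mcv_match + non_mcv_fraction * hist_frac_below

    # Only add an "=" contribution when the bound is NOT an MCV;
    # otherwise the MCV check above has already accounted for it.
    if add_eq_term and inclusive and bound not in mcv and ndistinct > 0:
        sel += 1.0 / ndistinct

    # Clamp to a valid probability.
    return min(max(sel, 0.0), 1.0)
```

With an MCV bound, < and <= already differ by exactly that value's frequency, with no adjustment needed. With a non-MCV bound, the two estimates coincide unless `add_eq_term` is set, in which case they differ by 1/ndistinct, which is only as trustworthy as ndistinct is.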