Re: Setting Statistics on Functional Indexes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: sthomas(at)optionshouse(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Setting Statistics on Functional Indexes
Date: 2012-11-14 21:00:25
Message-ID: 25730.1352926825@sss.pgh.pa.us
Lists: pgsql-performance

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Shouldn't there be a separate estimator for scalarlesel? Or should
> the existing estimator be adjusted to handle the two cases
> differently?

Well, it does handle it differently to some extent, in that the operator
itself is invoked when checking the MCV values, so we get the right
answer for those.
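
To illustrate the MCV point, here is a minimal sketch in Python. It is not the PostgreSQL source; the function name mcv_selectivity and the sample values and frequencies are made up for the example. It only shows why evaluating the operator itself against each most-common value makes < and <= come out differently when the bound is one of those values.

```python
import operator

def mcv_selectivity(op, bound, mcv_values, mcv_freqs):
    """Sum the frequencies of the MCV entries for which `value op bound` is true.

    Hypothetical helper for illustration only: the operator itself is
    evaluated against each most-common value, so '<' and '<=' naturally
    give different answers when the bound equals one of those values.
    """
    return sum(f for v, f in zip(mcv_values, mcv_freqs) if op(v, bound))

# Made-up statistics: three most-common values and their frequencies.
mcv_values = [10, 20, 30]
mcv_freqs = [0.25, 0.125, 0.125]

lt = mcv_selectivity(operator.lt, 20, mcv_values, mcv_freqs)  # only 10 qualifies
le = mcv_selectivity(operator.le, 20, mcv_values, mcv_freqs)  # 10 and 20 qualify

print(lt, le)  # 0.25 0.375
```

The gap between the two results is exactly the frequency of the value 20, which is the "=" contribution that the MCV check picks up on its own.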

The fact that there are no separate estimators for < and <= is something
we inherited from Berkeley, so I can't give the original rationale for
certain, but I think the notion was that the difference is imperceptible
when dealing with a continuous distribution. The question is whether
you think that the "=" case contributes any significant amount to the
probability given that the bound is not one of the MCV values. (If it
is, the MCV check will have accounted for it, so adding anything would
be wrong.) I guess we could add 1/ndistinct or something like that,
but I'm not convinced that will really make the estimates better, mainly
because ndistinct is none too reliable itself.
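
For concreteness, a rough sketch in Python of what "add 1/ndistinct" might look like. This is an assumption about the shape of such a correction, not existing PostgreSQL code; le_extra_selectivity and the numbers are hypothetical. It adds nothing when the bound is an MCV value, and otherwise adds 1/ndistinct, which also shows how directly the result tracks whatever ndistinct happens to be.

```python
def le_extra_selectivity(bound, mcv_values, ndistinct):
    """Hypothetical extra selectivity to add for '<=' relative to '<'.

    Assumption for illustration: if the bound is one of the MCV values,
    the MCV check has already counted the '=' case, so add nothing.
    Otherwise guess that the bound value accounts for 1/ndistinct of rows.
    """
    if bound in mcv_values:
        return 0.0
    if ndistinct <= 0:
        return 0.0  # no usable ndistinct estimate
    return 1.0 / ndistinct

# Made-up statistics for the example.
mcv_values = [10, 20, 30]

on_mcv = le_extra_selectivity(20, mcv_values, ndistinct=100)    # bound is an MCV
off_mcv = le_extra_selectivity(25, mcv_values, ndistinct=100)   # bound is not an MCV
off_mcv_alt = le_extra_selectivity(25, mcv_values, ndistinct=1000)

print(on_mcv, off_mcv, off_mcv_alt)
```

A tenfold error in ndistinct moves the correction by the same factor, so the adjustment is only as good as that estimate.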

regards, tom lane
