Re: Setting Statistics on Functional Indexes

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	sthomas(at)optionshouse(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Setting Statistics on Functional Indexes
Date:	2012-11-14 20:36:19
Message-ID:	CA+Tgmob5pBYZTTOF3_aRk3HWY9=sZTBp+2-OP3=3eumrcu8i9w@mail.gmail.com
Lists:	pgsql-performance

On Fri, Oct 26, 2012 at 5:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> So the bottom line is that this is a case where you need a lot of
> resolution in the histogram. I'm not sure there's anything good
> we can do to avoid that. I spent a bit of time thinking about whether
> we could use n_distinct to get some idea of how many duplicates there
> might be for the endpoint value, but n_distinct is unreliable enough
> that I can't develop a lot of faith in such a thing. Or we could just
> arbitrarily assume some fraction-of-a-histogram-bin's worth of
> duplicates, but that would make the results worse for some people.

I looked at this a bit. It seems to me that the root of this issue is
that we aren't distinguishing (at least, not as far as I can see)
between > and >=. ISTM that if the operator is >, we're doing exactly
the right thing, but if it's >=, we're giving exactly the same
estimate that we would give for >. That doesn't seem right.

Worse, I suspect that in this case we're actually giving a smaller
estimate for >= than we would for =, because = would estimate as if we
were searching for an arbitrary non-MCV, while >= acts like > and
says, hey, there's nothing beyond the end.

Shouldn't there be a separate estimator for scalarlesel? Or should
the existing estimator be adjusted to handle the two cases
differently?
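
To make the concern concrete, here is a toy sketch in Python. It is not PostgreSQL's selectivity code; the function names, the equi-depth histogram model, and the 1/n_distinct guess for equality are all simplifying assumptions made for illustration. It shows how an estimator that treats >= exactly like > returns zero at the histogram's upper endpoint, while a variant that adds an equality term does not.

```python
from bisect import bisect_right

def frac_greater(bounds, c):
    """Toy estimate of the fraction of rows with value > c.

    bounds: sorted equi-depth histogram boundaries (at least 2 entries).
    Each bin is assumed to hold an equal share of rows, spread uniformly.
    """
    if c < bounds[0]:
        return 1.0
    if c >= bounds[-1]:
        return 0.0  # "there's nothing beyond the end"
    nbins = len(bounds) - 1
    i = bisect_right(bounds, c) - 1      # bin with bounds[i] <= c < bounds[i+1]
    lo, hi = bounds[i], bounds[i + 1]
    within = (c - lo) / (hi - lo)        # linear interpolation inside the bin
    return 1.0 - (i + within) / nbins

def frac_equal(n_distinct):
    """Toy estimate for '= c' on an arbitrary non-MCV value (assumption)."""
    return 1.0 / n_distinct

def frac_greater_equal(bounds, c, n_distinct):
    """Hypothetical '>=' estimate: the '>' estimate plus the '=' estimate."""
    return min(1.0, frac_greater(bounds, c) + frac_equal(n_distinct))

bounds = [0, 10, 20, 30, 40]
gt_at_end = frac_greater(bounds, 40)              # treating >= like >
ge_at_end = frac_greater_equal(bounds, 40, 100)   # distinguishing the two
```

With these made-up numbers, the >-style estimate at the endpoint is zero, which is smaller than the equality estimate for the same value; the combined version is at least as large as the equality estimate. Whether the real fix belongs in a separate estimator or in the existing one is exactly the open question.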

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Setting Statistics on Functional Indexes at 2012-10-26 21:08:18 from Tom Lane

Responses

Re: Setting Statistics on Functional Indexes at 2012-11-14 20:55:39 from Claudio Freire
Re: Setting Statistics on Functional Indexes at 2012-11-14 21:00:25 from Tom Lane
