From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Idea about estimating selectivity for single-column expressions
Date: 2009-08-19 19:00:36
Message-ID: 4A8C4BD4.4060509@agliodbs.com
Lists: pgsql-hackers
Tom, Greg, Robert,
Here's my suggestion:
1. First, estimate the cost of the node with a very pessimistic (50%?)
selectivity for the calculation.
2. If the cost exceeds a certain threshold, then run the estimation
calculation against the histogram.
That way, we avoid the subtransaction and other overhead on very small sets.
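The two-step gating above can be sketched as follows. This is an illustrative model, not the planner's actual C code: the function name, the 50% default, and the threshold value are all assumptions made up for the example, and the "histogram" refinement is simplified to evaluating the expression over the MCV list with its frequencies.

```python
# Hypothetical sketch (not PostgreSQL's real planner API): only pay for
# the expensive expression-over-statistics pass when the node's cost,
# under a pessimistic selectivity guess, crosses a threshold.

DEFAULT_SELECTIVITY = 0.5   # pessimistic first-pass guess from step 1
COST_THRESHOLD = 10_000.0   # tuning knob; the value here is illustrative

def estimate_selectivity(expr, mcv_values, mcv_freqs, row_count,
                         cost_per_row=1.0):
    """Two-step estimate: cheap pessimistic guess first, then refine via
    the MCV list only when the node looks expensive enough to justify
    the extra (subtransaction-style) overhead."""
    pessimistic_cost = row_count * DEFAULT_SELECTIVITY * cost_per_row
    if pessimistic_cost < COST_THRESHOLD:
        # Very small set: skip the refinement overhead entirely.
        return DEFAULT_SELECTIVITY
    # Expensive node: evaluate the expression against each MCV and
    # weight the matches by their observed frequencies.
    matched = sum(f for v, f in zip(mcv_values, mcv_freqs) if expr(v))
    total = sum(mcv_freqs)
    # Simplification: assume non-MCV rows behave like the MCV average.
    return matched / total if total else DEFAULT_SELECTIVITY
```

For example, a predicate like `col % 100 = 0` over 200 equally frequent MCVs on a large table would be refined down from 0.5 to 0.01, while the same call on a 100-row table would return the cheap 0.5 default without touching the statistics.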
Also:
> Trying it on the MCVs makes a lot of sense. I'm not so sure about
> trying it on the histogram entries. There's no reason to assume that
> those cluster in any way that will be useful. (For example, suppose
> that I have the numbers 1 through 10,000 in some particular column and
> my expression is col % 100.)
Yes, but for seriously skewed column distributions, the difference in
frequency between the MCV and a sample "random" distribution will be
huge. And it's precisely those distributions which are currently
failing in the query planner.
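To put a rough number on that difference: the snippet below (illustrative figures, not taken from pg_statistic) compares how much of a table a 100-entry MCV list covers under a Zipf-like skewed distribution versus a uniform one. On the skewed column the MCVs already account for most of the rows, so evaluating the expression over just the MCVs pins down most of the answer; on the uniform column (the `col % 100` counterexample) they cover almost nothing.

```python
# Illustrative comparison: fraction of table rows covered by the top-N
# most-common values under a skewed vs. a uniform distribution.

def mcv_coverage(freqs, n_mcv):
    """Fraction of rows accounted for by the n_mcv most common values."""
    return sum(sorted(freqs, reverse=True)[:n_mcv])

# Zipf-like skewed distribution over 1000 distinct values.
raw = [1.0 / rank for rank in range(1, 1001)]
total = sum(raw)
skewed = [w / total for w in raw]

# Uniform distribution over the same 1000 distinct values.
uniform = [1.0 / 1000] * 1000

print(round(mcv_coverage(skewed, 100), 3))   # ~0.693: MCVs cover ~69% of rows
print(round(mcv_coverage(uniform, 100), 3))  # 0.1: MCVs cover only 10%
```

So for the heavily skewed columns that currently trip up the planner, an MCV-based pass captures most of the distribution mass, whereas histogram entries on a uniform column carry no such leverage.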
--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com