Re: Weird, bad 0.5% selectivity estimate for a column equal to itself

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Weird, bad 0.5% selectivity estimate for a column equal to itself
Date: 2013-06-25 23:10:40
Message-ID: 51CA2370.7040803@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


> Personally, I'll bet lunch that that external software is outright
> broken, ie it probably thinks "X = X" is constant true and they found
> they could save two lines of code and a few machine cycles by emitting
> that rather than not emitting anything. Of course, the amount of
> parsing/planning time wasted in dealing with the useless-and-incorrect
> clause exceeds what was saved by multiple orders of magnitude, but hey
> it was easy.

Well, it was more in the form of:

tab1.x = COALESCE(tab2.y,tab1.x)

... which some programmer 8 years ago though would be a cute shorthand for:

tab.x = tab2.y OR tab2.y IS NULL

Still stupid, sure, but when you're dealing with partly-third-party
legacy software which was ported from MSSQL (which has issues with "IS
NULL"), that's what you get.

> It wouldn't take too much new code to get the planner to replace "X = X"
> with "X IS NOT NULL", but I think we're probably fixing the wrong piece
> of software if we do.

Well, I'd be more satisfied with having a solution for:

WHERE tab1.x = tab1.y

... in general, even if it didn't have correlation stats. Like, what's
preventing us from using the same selectivity logic we would on a join
for that? It wouldn't be accurate for highly correlated columns (or for
colX = colx) but it would be a damsight better than defaultsel. Heck,
even multiplying the the two ndistincts together would be an improvement ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2013-06-25 23:36:17 Re: incorrect row estimates for primary key join
Previous Message Kevin Grittner 2013-06-25 22:13:15 Re: incorrect row estimates for primary key join