From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Weird, bad 0.5% selectivity estimate for a column equal to itself |
Date: | 2013-06-26 01:41:47 |
Message-ID: | 22653.1372210907@sss.pgh.pa.us |
Lists: | pgsql-performance |
Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Personally, I'll bet lunch that that external software is outright
>> broken, ie it probably thinks "X = X" is constant true and they found
>> they could save two lines of code and a few machine cycles by emitting
>> that rather than not emitting anything.
> Well, it was more in the form of:
> tab1.x = COALESCE(tab2.y,tab1.x)
Hm. I'm not following how you get from there to complaining about not
being smart about X = X, because that surely ain't the same.
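To illustrate why the two predicates differ, here is a minimal sketch that models SQL's NULL-aware equality in Python. The helper names `coalesce` and `sql_eq` and the sample rows are hypothetical, introduced only for illustration. The point is that `tab1.x = COALESCE(tab2.y, tab1.x)` depends on the value of `tab2.y`, whereas `X = X` does not.

```python
def coalesce(*args):
    """Return the first non-NULL (non-None) argument, like SQL COALESCE."""
    for a in args:
        if a is not None:
            return a
    return None

def sql_eq(a, b):
    """SQL-style equality: the result is NULL (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b

# Hypothetical (tab1.x, tab2.y) pairs produced by some join of tab1 and tab2.
rows = [(1, None), (1, 1), (1, 2), (None, None)]

# tab1.x = COALESCE(tab2.y, tab1.x): false whenever tab2.y is non-NULL
# and differs from tab1.x.
coalesce_pred = [sql_eq(x, coalesce(y, x)) for x, y in rows]

# tab1.x = tab1.x: true for every row where tab1.x is non-NULL.
self_pred = [sql_eq(x, x) for x, _ in rows]
```

The third row is where the two diverge: the COALESCE form rejects it because `tab2.y` is present and unequal, while `X = X` accepts it.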
> Well, I'd be more satisfied with having a solution for:
> WHERE tab1.x = tab1.y
> ... in general, even if it didn't have correlation stats. Like, what's
> preventing us from using the same selectivity logic we would on a join
> for that?
It's a totally different case. In the join case you expect that each
element of one table will be compared with each element of the other.
In the single-table case, that's exactly *not* what will happen, and
I don't see how you get to anything very useful without knowing
something about the value pairs that actually occur. As a concrete
example, applying the join selectivity logic would certainly give a
completely wrong answer for X = X, unless there were only one value
occurring in the column.
regards, tom lane
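The distinction drawn above between the join case and the single-table case can be sketched numerically. This is a simplified model, not PostgreSQL's actual estimator: it assumes no NULLs and simply counts matches, and the function names and sample data are hypothetical. The join-style figure counts equal pairs over the full cross product; the single-table figure looks only at the value pair that actually occurs in each row.

```python
from itertools import product

def join_style_selectivity(xs, ys):
    """Fraction of all (x, y) pairs in the cross product that are equal."""
    pairs = list(product(xs, ys))
    return sum(1 for x, y in pairs if x == y) / len(pairs)

def same_row_selectivity(rows):
    """Fraction of rows whose own (x, y) pair is equal."""
    return sum(1 for x, y in rows if x == y) / len(rows)

# Hypothetical column with 10 distinct values and no NULLs.
column = list(range(1, 11))

# Join-style reasoning: every element is compared with every element.
cross = join_style_selectivity(column, column)        # 10 matches out of 100 pairs

# What WHERE x = x really does: each row is compared only with itself.
actual = same_row_selectivity([(v, v) for v in column])

# With a single value in the column, the two figures coincide.
single = [7] * 10
cross_single = join_style_selectivity(single, single)
```

Under these assumptions the join-style figure is 1/10 while the true selectivity of `x = x` is 1, and the two agree only in the single-value case, which is the exception noted in the message.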