From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: <> join selectivity estimate question |
Date: | 2017-03-17 19:11:30 |
Message-ID: | 25906.1489777890@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Mar 17, 2017 at 1:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It would not be too hard to convince me that neqjoinsel should
>> simply return 1.0 for any semijoin/antijoin case, perhaps with
>> some kind of discount for nullfrac. Whether or not there's an
>> equal row, there's almost always going to be non-equal row(s).
>> Maybe we can think of a better implementation but that seems
>> like the zero-order approximation.
> Yeah, it's not obvious how to do better than that considering only one
> clause at a time. Of course, what we really want to know is
> P(x<>y|z=t), but don't ask me how to compute that.
Yeah. Another hole in this solution is that it means that the
estimate for x <> y will be quite different from the estimate
for NOT(x = y). You wouldn't notice it in the field unless
somebody forgot to put a negator link on their equality operator,
but it seems like ideally we'd think of a solution that made sense
for generic NOT in this context.
No, I have no idea how to do that.
regards, tom lane
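
For illustration only, here is a rough sketch of the mismatch described above. It is not PostgreSQL code: the function names are hypothetical, and using the outer side's null fraction for the "discount for nullfrac" is an assumption, since the thread does not say which side's nullfrac would be used. The first function models the proposed zero-order estimate for x <> y under a semijoin; the second models generic NOT handling, which takes one minus the selectivity of the negated clause.

```python
def clamp_probability(value: float) -> float:
    """Keep a selectivity estimate inside [0.0, 1.0]."""
    return min(1.0, max(0.0, value))


def neq_semijoin_selectivity(outer_nullfrac: float) -> float:
    """Hypothetical zero-order estimate for x <> y in a semijoin.

    Assumes nearly every non-NULL outer row finds at least one
    non-equal inner row, so only NULL outer values are discounted.
    """
    return clamp_probability(1.0 - outer_nullfrac)


def not_eq_semijoin_selectivity(eq_semijoin_sel: float) -> float:
    """Hypothetical generic NOT estimate: 1 - selectivity of (x = y)."""
    return clamp_probability(1.0 - eq_semijoin_sel)


# Made-up inputs: 25% of outer values are NULL, and the equality
# semijoin is estimated to match 75% of outer rows.
via_neq = neq_semijoin_selectivity(0.25)
via_not = not_eq_semijoin_selectivity(0.75)
print(via_neq, via_not)
```

With these made-up inputs the two paths give very different answers for what is logically the same condition, which is the hole being pointed out: the planner would only reach the second path when the equality operator has no negator link.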