From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Marko Tiikkaja <marko(at)joh(dot)to> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: 200 = 199 + 1? |
Date: | 2017-10-01 21:27:53 |
Message-ID: | 29652.1506893273@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I wrote:
> I experimented a bit with the attached patch, which modifies
> eqjoinsel_semi in two ways. First, if the number-of-distinct-values
> estimate for the inner rel is just a default rather than having any
> real basis, it replaces it with inner_rel->rows, effectively assuming
> that the inside of the IN or EXISTS is unique.
That might or might not be a good idea ...
> Second, it drops the
> fallback to selectivity 0.5 altogether, just applying the nd1 vs nd2
> heuristic all the time.
... but this probably isn't. A review of the history shows me that
this change amounts to reverting 3f5d2fe30, which doesn't seem like
a good plan because that was fixing user-reported misbehavior.
The problem is still that the nd2/nd1 calculation can produce garbage
if nd2 or nd1 is made-up. It's possible that we can get away with using
nd2 = inner_rel->rows as a suitable substitute for a default nd2, but
I'm much less happy to assume something like that for nd1.
Now, there's still not much to defend the 0.5 selectivity in particular;
according to the commit log for 3f5d2fe30, I used that because it
reproduced the behavior of previous versions that didn't understand what
a semijoin was at all. So we could perhaps substitute some other rule
there, but I don't know what.
I also note that the precise behavior of HEAD in this area only dates
back to ca5f88502, which hasn't even shipped in a release yet, so it
surely doesn't have a lot of field experience justifying it.
Other useful-though-not-terribly-recent discussions in this area include
https://www.postgresql.org/message-id/flat/201201112110.40403.andres%40anarazel.de
https://www.postgresql.org/message-id/13290.1335976455@sss.pgh.pa.us
Given that eqjoinsel_semi has been more or less a constant source of
issues, maybe we need to think about a wholesale redesign rather than
just incremental tweaking. But I have few ideas about what would be
better.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Dunstan | 2017-10-01 21:47:52 | Re: pg_ctl kill support for KILL signal was Re: [COMMITTERS] pgsql: Add test for postmaster crash restarts. |
Previous Message | Daniel Gustafsson | 2017-10-01 21:27:30 | Re: [PATCH] Off-by-one error in logical slot resource retention |