Our Internet connectivity failed as this was being sent. It looks
like at least the list didn't get it, so here goes another try.
Apologies for any duplication.
-Kevin
>>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I chewed on that for awhile. We can certainly optimize EXISTS
that's
> appearing in the ON-condition of a regular JOIN, because that's not
> really semantically different from a WHERE-condition.
Good to hear. I thought that might be doable. :-)
> But I don't think
> it's going to be reasonable to improve EXISTS in outer-JOIN ON
> conditions. There are a couple of problems. Consider
The discussion did make the difficulties clear.
> So this is not something I'm going to tackle; at least not this
> devel cycle.
Fair enough.
> One small step we can take in this direction, though, is to improve
the
> planner's internal handling of the qual conditions for IN and
EXISTS.
> Right now the process is just to throw the sub-select into the main
> range table and put the IN join conditions into the same place in
WHERE
> that the IN-clause was to start with. The trouble with this is that
the
> distribute_quals_to_rels processing has no idea that there's
anything
> special about the IN join conditions. We got away with that for the
> limited case of IN clauses at the top level of WHERE, but it's
become
> clear to me over the weekend that this has no hope of working for
NOT
> EXISTS --- since that's effectively an outer join, it absolutely has
to
> have the same kinds of qual-scheduling constraints as ordinary outer
> joins do. So we need a data structure that distribute_quals_to_rels
can
> work with. What I think needs to happen is that the initial pass
that
> pulls up optimizable IN/EXISTS sub-selects should not merge the
> SubLink's replacement qual clauses seamlessly, but put them
underneath a
> new node type, say "FlattenedSubLink", that retains knowledge of the
> join it's representing. The FlattenedSubLink would survive only as
far
> as distribute_quals_to_rels, which would distribute out the
contained
> qual conditions instead of the FlattenedSubLink itself --- but only
> after marking them properly for the outer-join restrictions. This
> representation would make it feasible to work with IN/EXISTS that
are
> inside JOIN ON conditions, which the present representation using a
> single in_info_list really can't do. The semantic issues are still
> there but at least the representation isn't getting in the way ...
Just curious, is that something for this cycle, or a TODO item?
Thanks for looking at this. The one part I'm not sure about is where
the CASE/EXISTS in the SELECT value list fits into this discussion.
It seems conceptually similar to the OUTER JOIN, but sort of a special
case, so I'm not sure what you had in mind there.
-Kevin