Re: EXISTS clauses not being optimized in the face of 'one time pass' optimizable expressions

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: EXISTS clauses not being optimized in the face of 'one time pass' optimizable expressions
Date: 2016-07-01 16:00:58
Message-ID: CAHyXU0xVjiQR1pdbA1Cutmu3OjLHML4_CZiAdg3astogRKxWLw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 1, 2016 at 10:27 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Jul 1, 2016 at 10:20 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> Yeah. Also, even if you could parse out those cases, it's a major
>> optimization fence. Consider if you have an ORDER BY clause here:
>>
>> SELECT FROM foo WHERE a OR b ORDER BY c;
>>
>> ... by pushing inside a union, you're going to be in trouble in real
>> world cases. That's just a mess and it would add a lot of runtime
>> analysis of the alternative paths. It's hard for me to believe
>> rewriting is easier and simpler than rewriting 'false OR x' to 'x'. I
>> also think that constant folding strategies are going to render much
>> more sensible output to EXPLAIN.
>
> I don't think that it's easier and simpler and didn't intend to say
> otherwise. I do think that I've run across LOTS of queries over the
> years where rewriting OR using UNION ALL was a lot faster, and I think
> that case is more likely to occur in practice than FALSE OR WHATEVER.
> But, I'm just throwing out opinions to see what sticks here; I'm not
> deeply invested in this.

Sure (I didn't put you in that position; I was just thinking out loud).
The problem with UNION ALL is that it's only safe when you know for
sure that both sides of the partition are non-overlapping. The author
of the query often knows this going in, but for the planner it's not so
simple to figure out in many cases. If there's a subset of cases. UNION
sans ALL is probably a dead end on performance grounds.
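
To make the overlap problem concrete, here is a minimal sketch. It
assumes a hypothetical table foo with NOT NULL flag columns a and b, and
it uses SQLite via Python only as a stand-in to show result-set
semantics; it says nothing about how the PostgreSQL planner behaves.

```python
import sqlite3

# SQLite stands in for PostgreSQL purely to show result-set semantics.
# Table foo and columns id, a, b are hypothetical illustration names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo (id INTEGER, a INTEGER NOT NULL, b INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [(1, 1, 0), (2, 0, 1), (3, 1, 1), (4, 0, 0)],  # row 3 satisfies both a and b
)


def ids(sql):
    """Run a query and return the ids it yields, sorted, duplicates kept."""
    return sorted(row[0] for row in conn.execute(sql))


# The original form: each qualifying row appears exactly once.
or_rows = ids("SELECT id FROM foo WHERE a OR b")

# Naive rewrite: the branches overlap on row 3, so it is returned twice.
naive_rows = ids(
    "SELECT id FROM foo WHERE a UNION ALL SELECT id FROM foo WHERE b"
)

# Rewrite with the second branch made disjoint from the first.
# NOT a is enough only because a is NOT NULL here; with nullable
# columns the exclusion would need to be NULL-aware.
disjoint_rows = ids(
    "SELECT id FROM foo WHERE a "
    "UNION ALL SELECT id FROM foo WHERE b AND NOT a"
)

print(or_rows, naive_rows, disjoint_rows)
```

The naive rewrite changes the result whenever a row satisfies both
conditions, so the rewrite is only equivalent when the branches are
known (or forced) to be disjoint.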

This hinges on Tom's earlier statements, "Much of the value of doing
constant-folding would disappear if we ran it before subquery pullup +
join simplification, because in non-stupidly-written queries those are
what expose the expression simplification opportunities."

and, especially, "We could run it twice but that seems certain to be a
dead loser most of the time."

It's pretty easy to craft a query where you're on the winning side,
but what's the worst case of doing two passes? Is constant folding a
non-trivial fraction of planning time?

merlin
