Re: EXISTS clauses not being optimized in the face of 'one time pass' optimizable expressions

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: EXISTS clauses not being optimized in the face of 'one time pass' optimizable expressions
Date: 2016-07-01 14:20:52
Message-ID: CAHyXU0xrUGZ1x2W2kvHmMppjhB-8WPoQ3YiZSkM1VtKHx4XiHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 1, 2016 at 9:11 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Jul 1, 2016 at 9:52 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Maybe, but neither UNION nor UNION ALL would duplicate the semantics
>>> of OR, so there's some handwaving here that I missed.
>
>> SELECT * FROM foo WHERE a = 5 OR a = 4
>> isn't equivalent to
>> SELECT * FROM foo WHERE a = 5
>> UNION
>> SELECT * FROM foo WHERE a = 4
>> ?
>
> It probably is, but you're assuming that "a" appears in the list of
> columns being unioned. If you make that just "SELECT b FROM ..."
> then the latter form gets rid of duplicate b values where the first
> doesn't. On the other hand, UNION ALL might introduce duplicates
> not present in the OR query's result.

Yeah. Also, even if you could parse out those cases, it's major
optimization fence. Consider if you have an ORDER BY clause here:

SELECT FROM foo WHERE a OR b ORDER BY c;

... by pushing inside a union, you're going to be in trouble in real
world cases. That's just a mess and it would add a lot of runtime
analysis of the alternative paths. It's hard for me to believe
rewriting is easier and simpler than rewriting 'false OR x' to 'x'. I
also thing that constant folding strategies are going to render much
more sensible output to EXPLAIN.

FYI, The query is something along the lines of
SELECT * FROM foo
WHERE
('a' = 'a' AND EXISTS ...)
OR ('a' = 'b' AND EXISTS ...)
OR ('a' = 'c' AND EXISTS ...)

...where the left side of the equality is a parameterized 'filter
mode' flag. That way the query can introduce filtering behaviors
without doing dynamic acrobatics.

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2016-07-01 14:22:42 Re: Reviewing freeze map code
Previous Message Ashutosh Bapat 2016-07-01 14:16:23 Re: Postgres_fdw join pushdown - wrong results with whole-row reference