Re: Non-trivial condition is only propagated to one side of JOIN

From: Tobias Hoffmann <ldev-list(at)thax(dot)hardliners(dot)org>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Non-trivial condition is only propagated to one side of JOIN
Date: 2024-08-25 17:04:32
Message-ID: 13ebf00f-2991-43c5-5d11-a8216cb3217a@thax.hardliners.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 25/08/2024 17:35, David G. Johnston wrote:
> On Sunday, August 25, 2024, Tobias Hoffmann
> <ldev-list(at)thax(dot)hardliners(dot)org> wrote:
>
>
> 3) Problematic example:
>
> # EXPLAIN SELECT * FROM tbl1 JOIN tbl2 ON tbl2.site_id =
> tbl1.site_id WHERE tbl1.site_id = 1 OR tbl1.site_id IS NULL;
>
>
> The “is null” predicate in this query is doing nothing as your next
> comment alludes to; you will produce no rows out of the join with a
> null site_id due to the use of the equals operator in the join.

Well, that's why I said: "keep in mind that this example is as
simplified as possible"...
Even though `tbl1.site_id = 1` is – in this case – completely equivalent
to `tbl1.site_id = 1 OR tbl1.site_id IS NULL` – the first one is
completely pushed down, but the second is not.

A more complete example might look more like this:

CREATE VIEW "subview1" AS
  SELECT tbl1.site_id, ... JOIN ... ON tbl1.site_id = tbl2.site_id
WHERE ...;

CREATE VIEW "view1" AS
  SELECT site_id, ... FROM subview1  -- maybe even: WHERE site_id IS
NOT NULL
  UNION ALL
  SELECT null, ...;

SELECT * FROM view1 WHERE (site_id = 1 OR site_id IS NULL);

The reason, why the outer query would have a more complicated condition
might have nothing to do with the subquery containing the JOIN.
(This is also not a `UNION ALL` special case: `site_id IS NULL` could
also be generated by a LEFT JOIN, e.g.)

But not pushing down the condition has the grave impact, that those -
otherwise working VIEWs (i.e. subview1) become "unfixably" broken, for
certain WHERE-conditions on the outside.

Another reason why I said `site_id INTEGER NOT NULL` and `IS NOT
DISTINCT FROM`, is that there might be some mathematical reason I'm not
yet aware of where the propagation would not be sound, but which would
not apply for arbitrary site_id [nullable, ...].
My primary goal is to find at least *some* way to get the condition
pushed further in to avoid the full table scan, and to not have to
completely rewrite all the VIEWs into a single big query, where I could
inject the site_id parameter (e.g. "$1") in multiple places as needed...

  Tobias

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Sabino Mullane 2024-08-25 17:14:36 Re: Better error message when --single is not the first arg to postgres executable
Previous Message Tom Lane 2024-08-25 16:52:49 Re: Non-trivial condition is only propagated to one side of JOIN