Re: Multiple-Table-Spanning Joins with ORs in WHERE Clause

From: Igor Neyman <ineyman(at)perceptron(dot)com>
To: Igor Neyman <ineyman(at)perceptron(dot)com>, "Sven R(dot) Kunze" <srkunze(at)mail(dot)de>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Multiple-Table-Spanning Joins with ORs in WHERE Clause
Date: 2016-09-22 14:39:44
Message-ID: CY4PR07MB2872AB33EDC46AF17F248D75DAC90@CY4PR07MB2872.namprd07.prod.outlook.com
Lists: pgsql-performance


-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Igor Neyman
Sent: Thursday, September 22, 2016 10:36 AM
To: Sven R. Kunze <srkunze(at)mail(dot)de>; pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Multiple-Table-Spanning Joins with ORs in WHERE Clause

-----Original Message-----
From: Igor Neyman
Sent: Thursday, September 22, 2016 10:33 AM
To: 'Sven R. Kunze' <srkunze(at)mail(dot)de>; pgsql-performance(at)postgresql(dot)org
Subject: RE: [PERFORM] Multiple-Table-Spanning Joins with ORs in WHERE Clause

-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Sven R. Kunze
Sent: Thursday, September 22, 2016 9:25 AM
To: pgsql-performance(at)postgresql(dot)org
Subject: [PERFORM] Multiple-Table-Spanning Joins with ORs in WHERE Clause

Hi pgsql-performance list,

what is the recommended way of doing **multiple-table-spanning joins with ORs in the WHERE-clause**?

Until now, we've used the LEFT OUTER JOIN to filter big_table like so:

SELECT DISTINCT <fields of big_table>
FROM
"big_table"
LEFT OUTER JOIN "table_a" ON ("big_table"."id" = "table_a"."big_table_id")
LEFT OUTER JOIN "table_b" ON ("big_table"."id" = "table_b"."big_table_id")
WHERE
"table_a"."item_id" IN (<handful of items>)
OR
"table_b"."item_id" IN (<handful of items>);

However, this results in an awfully slow plan (it requires scanning the complete big_table, which obviously isn't optimal).
So we decided (at least for now) to split the query into two separate ones and to merge and de-duplicate the results in application logic:

SELECT <fields of big_table>
FROM
"big_table" INNER JOIN "table_a" ON ("big_table"."id" = "table_a"."big_table_id")
WHERE
"table_a"."item_id" IN (<handful of items>);

SELECT <fields of big_table>
FROM
"big_table" INNER JOIN "table_b" ON ("big_table"."id" = "table_b"."big_table_id")
WHERE
"table_b"."item_id" IN (<handful of items>);

As you can imagine, we would be very glad to solve this issue with a
single query, without having to re-code logic that PostgreSQL already provides.
But how?

Best,
Sven

PS: if you require EXPLAIN ANALYZE, I can post them as well.

______________________________________________________________________________________________

Another option to try:

SELECT DISTINCT <fields of big_table>
FROM
"big_table"
LEFT OUTER JOIN "table_a" ON ("big_table"."id" = "table_a"."big_table_id" AND "table_a"."item_id" IN (<handful of items>))
LEFT OUTER JOIN "table_b" ON ("big_table"."id" = "table_b"."big_table_id" AND "table_b"."item_id" IN (<handful of items>));

Regards,
Igor Neyman

_______________________________________________________________________________________________________

Please disregard this last suggestion; it will not produce the required results.
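
A brief illustration of why it falls short, using the same placeholder tables as above: with the item filters moved into the ON clauses and no WHERE clause left, each LEFT OUTER JOIN still keeps every row of big_table and merely fills the table_a / table_b columns with NULLs where nothing matched, so the query returns all of big_table. One way to restore the intended filter is to require that at least one of the joins found a match. This is only a sketch; it assumes "big_table"."id" is never NULL, and nothing in this thread shows that it gets a better plan than the original query:

```sql
SELECT DISTINCT <fields of big_table>
FROM
"big_table"
LEFT OUTER JOIN "table_a" ON ("big_table"."id" = "table_a"."big_table_id" AND "table_a"."item_id" IN (<handful of items>))
LEFT OUTER JOIN "table_b" ON ("big_table"."id" = "table_b"."big_table_id" AND "table_b"."item_id" IN (<handful of items>))
WHERE
-- keep only the big_table rows for which at least one join matched
"table_a"."big_table_id" IS NOT NULL
OR
"table_b"."big_table_id" IS NOT NULL;
```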

The solution using UNION should work.
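
For reference, a minimal sketch of what such a UNION query could look like, built from the two separate queries quoted above. It assumes both branches select the same list of big_table columns in the same order. Plain UNION (as opposed to UNION ALL) removes duplicate rows, so it takes over the role of DISTINCT and of the de-duplication in application logic:

```sql
-- rows of big_table matched through table_a
SELECT <fields of big_table>
FROM
"big_table" INNER JOIN "table_a" ON ("big_table"."id" = "table_a"."big_table_id")
WHERE
"table_a"."item_id" IN (<handful of items>)
UNION
-- rows of big_table matched through table_b
SELECT <fields of big_table>
FROM
"big_table" INNER JOIN "table_b" ON ("big_table"."id" = "table_b"."big_table_id")
WHERE
"table_b"."item_id" IN (<handful of items>);
```

Each branch is an ordinary inner join filtered on a single table, which is the same shape as the two separate queries described above.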

Regards,
Igor Neyman
