Re: Query slow again after adding an `OR` operation (was: Slow PostgreSQL 10.6 query)

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Behrang Saeedzadeh <behrangsa(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Query slow again after adding an `OR` operation (was: Slow PostgreSQL 10.6 query)
Date: 2019-10-09 22:06:25
Message-ID: CAKJS1f8M-TPB86=hqGspDA0mD_pP=LWGRJA0u393xtRzPzUeKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, 10 Oct 2019 at 01:07, Behrang Saeedzadeh <behrangsa(at)gmail(dot)com> wrote:
>
> This is a follow up to https://www.postgresql.org/message-id/flat/CAERAJ%2B-1buiJ%2B_JWEo0a9Ao-CVMWpgp%3DEnFx1dJtnB3WmMi2zQ%40mail.gmail.com
>
> The query (generated by Hibernate) got a bit more complex and performance degraded again. I have uploaded all the details here (with changed table names, etc.): https://github.com/behrangsa/slow-query
>
> In short, the new query is:

The query mostly appears slow due to the "Rows Removed By Filter" in
the OR condition. The only way to get around not scanning the entire
branch_invoices table would be to somehow write the way in such a way
that allows it to go on the inner side of the join.

You could do that if you ensure there's an index on branch_invoices
(branch_id) and format the query as:

SELECT inv.id AS i_id,
inv.invoice_date AS inv_d,
inv.invoice_xid AS inv_xid,
inv.invoice_type AS inv_type,
brs.branch_id AS br_id,
cinvs.company_id AS c_id
FROM invoices inv
LEFT OUTER JOIN branch_invoices brs ON inv.id = brs.invoice_id
LEFT OUTER JOIN company_invoices cinvs ON inv.id = cinvs.invoice_id
INNER JOIN branches br ON brs.branch_id = br.id
WHERE brs.branch_id IN (SELECT br1.id
FROM branches br1
INNER JOIN access_rights ar1 ON
br1.id = ar1.branch_id
INNER JOIN users usr1 ON ar1.user_id = usr1.id
INNER JOIN groups grp1 ON
ar1.group_id = grp1.id
INNER JOIN group_permissions gpr1 ON
grp1.id = gpr1.group_id
INNER JOIN permissions prm1 ON
gpr1.permission_id = prm1.id
WHERE usr1.id = 1636
AND prm1.code = 'C2'
AND ar1.access_type = 'T1')
UNION ALL
SELECT br3.id
FROM companies cmp
INNER JOIN branches br3 ON cmp.id =
br3.company_id
INNER JOIN access_rights ar2 ON
cmp.id = ar2.company_id
INNER JOIN users usr2 ON ar2.user_id = usr2.id
INNER JOIN groups g2 ON ar2.group_id = g2.id
INNER JOIN group_permissions gpr2 ON
g2.id = gpr2.group_id
INNER JOIN permissions prm2 ON
gpr2.permission_id = prm2.id
WHERE usr2.id = 1636
AND prm2.code = 'C2'
AND ar2.access_type = 'T1')
ORDER BY inv.invoice_date DESC, br.name ASC
LIMIT 12;

The planner may then choose to pullup the subquery and uniquify it
then put it on the outside of a nested loop join then lookup the
branch_invoices record using the index on branch_id. I think this is
quite a likely plan since the planner estimates there's only going to
be 1 row from each of the subqueries.

Also note, that the LEFT JOIN you have to branch_invoices is not
really a left join since you're insisting that the branch_id must be
in the first or 2nd sub-plan. There's no room for it to be NULL. The
planner will just convert that to an INNER JOIN with the above query
since that'll give it the flexibility to put the subquery in the IN
clause on the outside of the join (after having uniquified it).
You'll need to decide what you actually want the behaviour to be here.
If you do need those NULL rows then you'd better move your WHERE quals
down into the join condition for branch_invoices table. I'd suggest
testing with some mock-up data if you're uncertain of what I mean.

If you find that is faster and you can't rewrite the query due to it
having been generated by Hibernate, then that sounds like a problem
with Hibernate. PostgreSQL does not currently attempt to do any
rewrites which convert OR clauses to use UNION or UNION ALL. No amount
of tweaking the planner settings is going to change that fact.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jerry Sievers 2019-10-09 23:05:20 Re: Modification of data in base folder and very large tables
Previous Message Jeff Janes 2019-10-09 19:54:57 Re: Would SSD improve Index Only Scan performance by a lot?