Re: Query optimization advice for beginners

From: Michael Lewis <mlewis(at)entrata(dot)com>
To: Kemal Ortanca <kemal(dot)ortanca(at)outlook(dot)com>
Cc: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Query optimization advice for beginners
Date: 2020-01-27 16:46:18
Message-ID: CAHOFxGoRACkkhag_PvGnUECHnAH1M3+s0WivA8HQ6WsGTX6Yzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

You've got two references to order_basketitemdetail both aliased to bid and
ALSO a table called order_basketitembatch aliased to bib. I assume that
confuses the planner, but even if it doesn't it certainly confuses any new
developers trying to understand the query's intention.

The biggest thing that leaps out at me on the explain plan is the 822
thousand loops on index order_basketitembatch_detail_id_9268ccff. That
seems to be the subquery in the where clause of the subquery in the main
where clause. I never get great results when I nest sub-queries multiple
levels. Without knowing your data, we can only make guesses about
restructuring the query so it performs better.

select bi.id AS basketitem_id --coalesce(sum(bid.quantity), 0)
from order_basketitem bi
--, order_basketitemdetail bid
, order_order o
where o.type in (2,7,9) and o.id = bi.order_id
and o.is_cancelled = false
and bi.is_cancelled = false
and o.is_closed = false
and o.is_picked = false
and o.is_invoiced = false
and o.is_sent = false
--and bi.id = bid.basketitem_id

For a query like the above, how restrictive is it? That is, of ALL the
records in order_basketitem table, how many are returned by the above
condition? I would think that the number of orders that have been picked or
invoiced or sent or closed or cancelled would be LARGE and so this query
may eliminate most of the orders from being considered. Not to mention the
order type id restriction.

If I found that the above query resulted in 1% of the table being returned
perhaps, there are a number of ways to influence the planner to do this
work first such as-

1) put this in a sub-query as the FROM and include OFFSET 0 hack to prevent
in-lining
2) put in a CTE using the WITH keyword (note- need to use MATERIALIZED
option once on PG12 since default behavior changes)
3) if the number of records returned is large (10 thousand maybe?) and the
optimizer is making bad choices on the rest of the query that uses this
result set, put this query into a temp table, analyze it, and then use it.

>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Laurenz Albe 2020-01-28 08:37:42 Re: Query optimization advice for beginners
Previous Message Kemal Ortanca 2020-01-27 15:33:29 Re: Query optimization advice for beginners