Re: OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices

From: Gunther <raj(at)gusw(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
Date: 2017-11-02 16:36:13
Message-ID: c3cf6a5c-d622-3b39-36b6-a37da59f3760@gusw.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks for your support Laurent.

I have an idea on one thing you said:

> Just adding to your voice. I recently experienced the same issue with a complex multi-table view, including pivots, and was surprised to see all the nested loops everywhere
and here is the clue for me:
> in spite of indices being available.
I would say that sometimes indexes are detrimental. If you don't need
them for other reasons, you might want to not have them. And without the
index, the Nested Loop strategy might not be chosen.

But that is a side-issue, because it can often not be avoided. Just
saying in case it might help.

I also found the opposite now. In the query that made me "blow the lid"
and "complain" here, my team decided to add an index and that did not
get rid of Nested Loops but at least made the inner table access indexed
rather than a table scan and the performance ended up OK. But it's not
always predictable, and these indexes could trap the planner into
sub-optimal solutions still.

I think there is an opportunity for a PgSQL query plan extension,
especially wen dealing with CTE (WITH-clauses), PgSQL could make them a
temporary table and add indexes that it needs for it on the fly, because
after it has done one pass over the inner loop sequential scan it knows
perfectly well how many rows it has, and knowing how many more
iterations are coming from the sub-query that's driving the Nested Loop,
it could decide that it's much faster to put an index on the nested
relation, temporarily materialized. Or it could even decide to change
it's plan mid-way and do the Hash Join.

This is why I had always dreamed that the PgSQL optimizer had some easy
API where one could plug in experimental strategies. I personally am
extremely efficient with XSLT for complex intelligent algorithms, and I
dream of a PgSQL query plan structure exposed as XML which an XSLT
plugin could then process to edit the plan. People could experiment with
awesome intelligent new strategies based on statistics gathered along
the way of the execution.

regards,
-Gunther

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Gunther 2017-11-02 17:14:29 Re: OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
Previous Message Laurenz Albe 2017-11-02 08:30:28 Re: OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices