Re: Planning performance problem (67626.278ms)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeremy Schneider <schnjere(at)amazon(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Krzysztof Plocharz <plocharz(at)9livesdata(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Planning performance problem (67626.278ms)
Date: 2019-04-08 23:26:44
Message-ID: 20190408232644.r6xixkeulsmomwfj@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

On 2019-04-08 16:10:17 -0700, Jeremy Schneider wrote:
> On 4/8/19 07:42, Justin Pryzby wrote:
> > On Mon, Apr 08, 2019 at 04:33:34PM +0200, Pavel Stehule wrote:
> >> po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz <plocharz(at)9livesdata(dot)com> napsal:
> >>
> >>> We have some very strange query planning problem. Long story short it
> >>> takes 67626.278ms just to plan. Query execution takes 12ms.
> >>>
> >>> Query has 7 joins and 2 subselects.
> >>> It looks like the issue is not deterministic, sometimes is takes few ms
> >>> to plan the query.
> >>>
> >>> One of the tables has 550,485,942 live tuples and 743,504,012 dead
> >>> tuples. Running ANALYZE on that tables solves the problem only temporarily.
> >>>
> >>> Question is how can we debug what is going on?
> >>
> >> please check your indexes against bloating. Planner get min and max from
> >> indexes and this operation is slow on bloat indexes.
> >
> > I think that's from get_actual_variable_range(), right ?
>
> For what it's worth, I have seen a similar issue on Aurora PG 9.6 where
> query planning took a very long time (multiple minutes). In this
> particular case, there wasn't anything Aurora-specific about the call to
> get_actual_variable_range. We weren't able to distinctly identify the
> root cause or build a reproducible test case -- but we suspect that an
> inefficiency might exist in community PostgreSQL code.
>
> For debugging, a few ideas:
>
> 1) capture a stack with pstack or perf record --call-graph
>
> 2) capture the execution plan of the SQL w slow planning
>
> 3) capture detailed stats for all relations and objects involved
>
> 4) capture the usual info for bug reporting (preface section in docs)
>
> A reproducible test case is the gold standard; I'm keeping my eyes open
> for another case too.
>
> For the slow planning case that I saw, the slow process was almost
> entirely in this call stack (captured with perf record --call-graph):
> ...
> index_fetch_heap
> index_getnext
> get_actual_variable_range
> ineq_histogram_selectivity
> scalarineqsel
> mergejoinscansel
> initial_cost_mergejoin
> try_mergejoin_path
> add_paths_to_joinrel
> make_join_rel
> join_search_one_level
> standard_join_search
> make_one_rel
> query_planner
> ...

I suspect some of this might be related to < 11 not having the following
commit:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3ca930fc39ccf987c1c22fd04a1e7463b5dd0dfd

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Duncan Kinnear 2019-04-09 21:26:22 Re: Commit(?) overhead
Previous Message Jeremy Schneider 2019-04-08 23:10:17 Re: Planning performance problem (67626.278ms)