From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeremy Schneider <schnjere(at)amazon(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Krzysztof Plocharz <plocharz(at)9livesdata(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Planning performance problem (67626.278ms)
Date: 2019-04-08 23:26:44
Message-ID: 20190408232644.r6xixkeulsmomwfj@alap3.anarazel.de
Lists: pgsql-performance

Hi,
On 2019-04-08 16:10:17 -0700, Jeremy Schneider wrote:
> On 4/8/19 07:42, Justin Pryzby wrote:
> > On Mon, Apr 08, 2019 at 04:33:34PM +0200, Pavel Stehule wrote:
> >> On Mon, Apr 8, 2019 at 16:11, Krzysztof Plocharz <plocharz(at)9livesdata(dot)com> wrote:
> >>
> >>> We have a very strange query planning problem. Long story short, it
> >>> takes 67626.278ms just to plan; query execution takes 12ms.
> >>>
> >>> The query has 7 joins and 2 subselects.
> >>> The issue does not look deterministic; sometimes it takes only a few ms
> >>> to plan the query.
> >>>
> >>> One of the tables has 550,485,942 live tuples and 743,504,012 dead
> >>> tuples. Running ANALYZE on that table solves the problem only temporarily.
> >>>
> >>> The question is: how can we debug what is going on?
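
Not a fix for the planning stall itself, but with a dead:live ratio like
that it's worth checking whether (auto)vacuum is keeping up on that
table. Something along these lines (table name made up):

  SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
  FROM pg_stat_user_tables
  WHERE relname = 'big_table';

  -- ANALYZE only refreshes statistics; VACUUM actually removes the
  -- dead tuples the planner may otherwise have to step over
  VACUUM (VERBOSE) big_table;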
> >>
> >> Please check your indexes for bloat. The planner gets the min and max
> >> values from indexes, and this operation is slow on bloated indexes.
> >
> > I think that's from get_actual_variable_range(), right?
>
> For what it's worth, I have seen a similar issue on Aurora PG 9.6 where
> query planning took a very long time (multiple minutes). In this
> particular case, there wasn't anything Aurora-specific about the call to
> get_actual_variable_range. We weren't able to definitively identify the
> root cause or build a reproducible test case -- but we suspect that an
> inefficiency might exist in community PostgreSQL code.
>
> For debugging, a few ideas:
>
> 1) capture a stack with pstack or perf record --call-graph
>
> 2) capture the execution plan of the SQL with slow planning (example below)
>
> 3) capture detailed stats for all relations and objects involved
>
> 4) capture the usual info for bug reporting (preface section in docs)
>
> A reproducible test case is the gold standard; I'm keeping my eyes open
> for another case too.
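
For 2), note that plain EXPLAIN (no ANALYZE) has printed a "Planning
time:" line since 9.4, so the planner cost can be isolated without
actually running the slow query (the query below is just a placeholder):

  \timing on
  EXPLAIN SELECT * FROM big_table JOIN other_table USING (id);
  -- compare the "Planning time:" line against \timing's total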
>
> For the slow planning case that I saw, the slow process was almost
> entirely in this call stack (captured with perf record --call-graph):
> ...
> index_fetch_heap
> index_getnext
> get_actual_variable_range
> ineq_histogram_selectivity
> scalarineqsel
> mergejoinscansel
> initial_cost_mergejoin
> try_mergejoin_path
> add_paths_to_joinrel
> make_join_rel
> join_search_one_level
> standard_join_search
> make_one_rel
> query_planner
> ...
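An untested sketch of how that path can be hit: leave a long run of dead
tuples at an index endpoint, then let merge-join costing probe the
actual min/max at plan time via get_actual_variable_range():

  -- autovacuum disabled so the dead tuples stay around for the demo
  CREATE TABLE t1 (id int PRIMARY KEY) WITH (autovacuum_enabled = off);
  CREATE TABLE t2 (id int PRIMARY KEY) WITH (autovacuum_enabled = off);
  INSERT INTO t1 SELECT generate_series(1, 5000000);
  INSERT INTO t2 SELECT generate_series(1, 5000000);
  ANALYZE t1; ANALYZE t2;            -- build the histograms
  DELETE FROM t1 WHERE id > 1000;    -- dead tuples at the top of the index
  -- planning (not execution) of the merge-join candidate now has to
  -- step over the dead index entries while estimating the join:
  EXPLAIN SELECT * FROM t1 JOIN t2 USING (id);
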
I suspect some of this might be related to versions < 11 not having the
following commit:
Greetings,
Andres Freund