From: | Jeremy Schneider <schnjere(at)amazon(dot)com> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | Krzysztof Plocharz <plocharz(at)9livesdata(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Planning performance problem (67626.278ms) |
Date: | 2019-04-08 23:10:17 |
Message-ID: | b145b428-3351-ec86-d522-100a5bf8ded7@amazon.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 4/8/19 07:42, Justin Pryzby wrote:
> On Mon, Apr 08, 2019 at 04:33:34PM +0200, Pavel Stehule wrote:
>> po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz <plocharz(at)9livesdata(dot)com> napsal:
>>
>>> We have some very strange query planning problem. Long story short it
>>> takes 67626.278ms just to plan. Query execution takes 12ms.
>>>
>>> Query has 7 joins and 2 subselects.
>>> It looks like the issue is not deterministic, sometimes is takes few ms
>>> to plan the query.
>>>
>>> One of the tables has 550,485,942 live tuples and 743,504,012 dead
>>> tuples. Running ANALYZE on that tables solves the problem only temporarily.
>>>
>>> Question is how can we debug what is going on?
>>
>> please check your indexes against bloating. Planner get min and max from
>> indexes and this operation is slow on bloat indexes.
>
> I think that's from get_actual_variable_range(), right ?
For what it's worth, I have seen a similar issue on Aurora PG 9.6 where
query planning took a very long time (multiple minutes). In this
particular case, there wasn't anything Aurora-specific about the call to
get_actual_variable_range. We weren't able to distinctly identify the
root cause or build a reproducible test case -- but we suspect that an
inefficiency might exist in community PostgreSQL code.
For debugging, a few ideas:
1) capture a stack with pstack or perf record --call-graph
2) capture the execution plan of the SQL w slow planning
3) capture detailed stats for all relations and objects involved
4) capture the usual info for bug reporting (preface section in docs)
A reproducible test case is the gold standard; I'm keeping my eyes open
for another case too.
For the slow planning case that I saw, the slow process was almost
entirely in this call stack (captured with perf record --call-graph):
...
index_fetch_heap
index_getnext
get_actual_variable_range
ineq_histogram_selectivity
scalarineqsel
mergejoinscansel
initial_cost_mergejoin
try_mergejoin_path
add_paths_to_joinrel
make_join_rel
join_search_one_level
standard_join_search
make_one_rel
query_planner
...
-Jeremy
--
Jeremy Schneider
Database Engineer
Amazon Web Services
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2019-04-08 23:26:44 | Re: Planning performance problem (67626.278ms) |
Previous Message | David G. Johnston | 2019-04-08 22:42:57 | Re: Oracle to postgres migration |