Re: BUG #17710: Slow queries (100% CPU) after auto-vacuum

From: Jean-Francois Levesque <jf(dot)levesque(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17710: Slow queries (100% CPU) after auto-vacuum
Date: 2022-12-12 18:36:18
Message-ID: CADYo02geEmYRSo8S_+E5adFu__fh9X+KcXm4RG53iQATM58+Vg@mail.gmail.com
Lists: pgsql-bugs

FYI, I have been able to execute the same queries on a local database
without the transaction that pytest adds, and the issue does not occur.
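For reference, here is a rough sketch of the two ways I run the statement. The
UPDATE below is only a stand-in reconstructed from the plan shape above (the
SET clause is hypothetical and the real query is generated by the ORM):

BEGIN;
-- the test data is created here, inside the same (pytest-managed) transaction
EXPLAIN ANALYZE
UPDATE shared_models_session
   SET name = name                 -- hypothetical SET clause
 WHERE id IN (SELECT v0.id
                FROM shared_models_session v0
                JOIN shared_models_sessionmonitoring v1 ON v1.session_id = v0.id
                JOIN shared_models_monitoring u0 ON u0.id = v1.monitoring_id
               WHERE u0.owner_id = 28
                 FOR UPDATE);
ROLLBACK;
-- When the same EXPLAIN ANALYZE is run outside the transaction (autocommit),
-- the plan is always the fast one.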

On Mon, Dec 12, 2022 at 9:37 AM Jean-Francois Levesque <jf(dot)levesque(at)gmail(dot)com> wrote:

> Hi Tom,
>
> Thanks for the wiki page. Here are the EXPLAIN ANALYZE results on the
> query.
>
> The EXPLAINs have been executed inside the transaction, at the end of the
> test, just before the rollback. Initially I ran them outside the
> transaction, but there the plans are always clean.
>
> FYI, I also have this issue with v12, v13 and v14. I have not been able to
> test with v15 yet.
>
> Thanks,
>
> Jean-Francois
>
> ------- v10 -------
>
> v1/shared_models/tests/test_session.py Update on shared_models_session  (cost=46.88..175.60 rows=16 width=530) (actual time=0.003..0.004 rows=0 loops=1)
>   ->  Nested Loop  (cost=46.88..175.60 rows=16 width=530) (actual time=0.003..0.003 rows=0 loops=1)
>         ->  HashAggregate  (cost=46.60..46.76 rows=16 width=32) (actual time=0.003..0.003 rows=0 loops=1)
>               Group Key: "ANY_subquery".id
>               ->  Subquery Scan on "ANY_subquery"  (cost=4.79..46.56 rows=16 width=32) (actual time=0.002..0.002 rows=0 loops=1)
>                     ->  LockRows  (cost=4.79..46.40 rows=16 width=22) (actual time=0.002..0.002 rows=0 loops=1)
>                           ->  Nested Loop  (cost=4.79..46.24 rows=16 width=22) (actual time=0.002..0.002 rows=0 loops=1)
>                                 ->  Nested Loop  (cost=4.51..27.86 rows=16 width=16) (actual time=0.002..0.002 rows=0 loops=1)
>                                       ->  Index Scan using shared_models_monitoring_owner_id_aedf0b23 on shared_models_monitoring u0  (cost=0.14..8.16 rows=1 width=10) (actual time=0.002..0.002 rows=0 loops=1)
>                                             Index Cond: (owner_id = 28)
>                                       ->  Bitmap Heap Scan on shared_models_sessionmonitoring v1  (cost=4.37..19.59 rows=11 width=14) (never executed)
>                                             Recheck Cond: (monitoring_id = u0.id)
>                                             ->  Bitmap Index Scan on shared_models_sessionmonitoring_monitoring_id_e9953e3e  (cost=0.00..4.36 rows=11 width=0) (never executed)
>                                                   Index Cond: (monitoring_id = u0.id)
>                                 ->  Index Scan using shared_models_session_pkey on shared_models_session v0  (cost=0.28..1.15 rows=1 width=10) (never executed)
>                                       Index Cond: (id = v1.session_id)
>         ->  Index Scan using shared_models_session_pkey on shared_models_session  (cost=0.28..8.05 rows=1 width=453) (never executed)
>               Index Cond: (id = "ANY_subquery".id)
> Planning time: 0.116 ms
> Execution time: 0.032 ms
>
>
> ------- v11 - normal speed -------
>
> v1/shared_models/tests/test_session.py Update on shared_models_session  (cost=45.54..170.25 rows=16 width=530) (actual time=0.006..0.007 rows=0 loops=1)
>   ->  Nested Loop  (cost=45.54..170.25 rows=16 width=530) (actual time=0.006..0.006 rows=0 loops=1)
>         ->  HashAggregate  (cost=45.25..45.41 rows=16 width=32) (actual time=0.005..0.006 rows=0 loops=1)
>               Group Key: "ANY_subquery".id
>               ->  Subquery Scan on "ANY_subquery"  (cost=4.79..45.21 rows=16 width=32) (actual time=0.004..0.005 rows=0 loops=1)
>                     ->  LockRows  (cost=4.79..45.05 rows=16 width=22) (actual time=0.004..0.005 rows=0 loops=1)
>                           ->  Nested Loop  (cost=4.79..44.89 rows=16 width=22) (actual time=0.004..0.004 rows=0 loops=1)
>                                 ->  Nested Loop  (cost=4.51..27.86 rows=16 width=16) (actual time=0.004..0.004 rows=0 loops=1)
>                                       ->  Index Scan using shared_models_monitoring_owner_id_aedf0b23 on shared_models_monitoring u0  (cost=0.14..8.16 rows=1 width=10) (actual time=0.004..0.004 rows=0 loops=1)
>                                             Index Cond: (owner_id = 2)
>                                       ->  Bitmap Heap Scan on shared_models_sessionmonitoring v1  (cost=4.37..19.59 rows=11 width=14) (never executed)
>                                             Recheck Cond: (monitoring_id = u0.id)
>                                             ->  Bitmap Index Scan on shared_models_sessionmonitoring_monitoring_id_e9953e3e  (cost=0.00..4.36 rows=11 width=0) (never executed)
>                                                   Index Cond: (monitoring_id = u0.id)
>                                 ->  Index Scan using shared_models_session_pkey on shared_models_session v0  (cost=0.28..1.06 rows=1 width=10) (never executed)
>                                       Index Cond: (id = v1.session_id)
>         ->  Index Scan using shared_models_session_pkey on shared_models_session  (cost=0.28..7.80 rows=1 width=453) (never executed)
>               Index Cond: (id = "ANY_subquery".id)
> Planning Time: 0.162 ms
> Execution Time: 0.061 ms
>
>
> ------- v11 - slow query -------
>
> v1/shared_models/tests/test_session.py Update on shared_models_session  (cost=0.90..33.02 rows=1 width=530) (actual time=86918.598..86918.600 rows=0 loops=1)
>   ->  Nested Loop Semi Join  (cost=0.90..33.02 rows=1 width=530) (actual time=86918.597..86918.598 rows=0 loops=1)
>         Join Filter: (shared_models_session.id = "ANY_subquery".id)
>         ->  Index Scan using shared_models_session_pkey on shared_models_session  (cost=0.25..8.26 rows=1 width=453) (actual time=0.005..1.762 rows=488 loops=1)
>         ->  Subquery Scan on "ANY_subquery"  (cost=0.65..24.74 rows=1 width=32) (actual time=178.107..178.107 rows=0 loops=488)
>               ->  LockRows  (cost=0.65..24.73 rows=1 width=22) (actual time=178.107..178.107 rows=0 loops=488)
>                     ->  Nested Loop  (cost=0.65..24.72 rows=1 width=22) (actual time=178.106..178.106 rows=0 loops=488)
>                           Join Filter: (v1.monitoring_id = u0.id)
>                           ->  Nested Loop  (cost=0.50..16.54 rows=1 width=20) (actual time=0.537..177.918 rows=488 loops=488)
>                                 Join Filter: (v0.id = v1.session_id)
>                                 Rows Removed by Join Filter: 118828
>                                 ->  Index Scan using shared_models_sessionmonitoring_monitoring_id_e9953e3e on shared_models_sessionmonitoring v1  (cost=0.25..8.26 rows=1 width=14) (actual time=0.004..0.116 rows=488 loops=488)
>                                 ->  Index Scan using shared_models_session_pkey on shared_models_session v0  (cost=0.25..8.26 rows=1 width=10) (actual time=0.002..0.354 rows=244 loops=238144)
>                           ->  Index Scan using shared_models_monitoring_owner_id_aedf0b23 on shared_models_monitoring u0  (cost=0.14..8.16 rows=1 width=10) (actual time=0.000..0.000 rows=0 loops=238144)
>                                 Index Cond: (owner_id = 6)
> Planning Time: 0.111 ms
> Execution Time: 86918.632 ms
>
>
> On Fri, Dec 9, 2022 at 9:13 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
>> > [ query gets slower after auto-vacuum ]
>>
>> This sounds less like an auto-vacuum problem than an auto-analyze
>> problem. That is, probably a background update of table statistics
>> is happening and that is triggering the planner to pick a different
>> plan. In an ideal world, more up-to-date stats would always result
>> in a better plan, but reality is not always so kind.
>>
>> There is not enough info here to diagnose the problem though.
>> At minimum we need to see EXPLAIN ANALYZE output for the problem
>> query in both the fast and slow states. We have a wiki page about
>> reporting slow-query problems [1] which asks for quite a bit of
>> info, but for starters maybe the EXPLAIN ANALYZE results would
>> be enough.
>>
>> > I am having an issue that started with 11.0 and is still present with
>> 11.18.
>> > The issue is not present with 10.23.
>>
>> If you could get EXPLAIN ANALYZE results from v10 as well as v11,
>> that would be useful too.
>>
>> regards, tom lane
>>
>> [1] https://wiki.postgresql.org/wiki/Slow_Query_Questions
>>
>
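PS: a quick way to check whether auto-analyze has already touched the involved
tables before capturing the plans (just a sketch, using the standard
statistics view):

SELECT relname, last_autovacuum, last_autoanalyze, n_live_tup, n_mod_since_analyze
  FROM pg_stat_user_tables
 WHERE relname IN ('shared_models_session',
                   'shared_models_sessionmonitoring',
                   'shared_models_monitoring');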
