Re: Poor performance due to parallel seq scan on indexed date field

From: Holger Jakobs <holger(at)jakobs(dot)com>
To: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Poor performance due to parallel seq scan on indexed date field
Date: 2023-06-21 18:37:29
Message-ID: 85548e4d-b7ce-e061-e43e-8a8e332dc4e5@jakobs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin


Am 21.06.23 um 20:31 schrieb Wells Oliver:
> Dead simple date scan across a big-ish table (est. 23,153,666 rows)
>
> explain analyze select count(*) from vw_pitches where game_date >=
> '2022-06-21' and game_date <= '2023-06-21';
>
> The view does do some joins but those don't seem to be the issue to me.
>
> Planner does:
>
> Finalize Aggregate  (cost=3596993.88..3596993.89 rows=1 width=8)
> (actual time=69980.491..69982.076 rows=1 loops=1)
>   ->  Gather  (cost=3596993.46..3596993.87 rows=4 width=8) (actual
> time=69979.137..69982.071 rows=5 loops=1)
>         Workers Planned: 4
>         Workers Launched: 4
>         ->  Partial Aggregate  (cost=3595993.46..3595993.47 rows=1
> width=8) (actual time=69975.136..69975.137 rows=1 loops=5)
>               ->  Nested Loop  (cost=0.44..3591408.37 rows=1834034
> width=0) (actual time=0.882..69875.934 rows=1458419 loops=5)
>                     ->  Parallel Seq Scan on pitches p
>  (cost=0.00..3537431.89 rows=1834217 width=12) (actual
> time=0.852..68914.256 rows=1458419 loops=5)
>                           Filter: ((game_date >= '2022-06-21'::date)
> AND (game_date <= '2023-06-21'::date))
>                           Rows Removed by Filter: 3212310
>                     ->  Memoize  (cost=0.44..0.47 rows=1 width=4)
> (actual time=0.000..0.000 rows=1 loops=7292095)
>                           Cache Key: p.pitcher_identity_id
>                           Cache Mode: logical
>                           Hits: 1438004  Misses: 21042  Evictions: 0
>  Overflows: 0  Memory Usage: 2138kB
>                           Worker 0:  Hits: 1429638  Misses: 21010
>  Evictions: 0  Overflows: 0  Memory Usage: 2134kB
>                           Worker 1:  Hits: 1456755  Misses: 21435
>  Evictions: 0  Overflows: 0  Memory Usage: 2177kB
>                           Worker 2:  Hits: 1433557  Misses: 21201
>  Evictions: 0  Overflows: 0  Memory Usage: 2154kB
>                           Worker 3:  Hits: 1428727  Misses: 20726
>  Evictions: 0  Overflows: 0  Memory Usage: 2105kB
>                           ->  Index Only Scan using identity_pkey on
> identity idpitcher  (cost=0.43..0.46 rows=1 width=4) (actual
> time=0.007..0.007 rows=1 loops=105414)
>                                 Index Cond: (identity_id =
> p.pitcher_identity_id)
>                                 Heap Fetches: 83
> Planning Time: 1.407 ms
> Execution Time: 69982.927 ms
>
> Is there something to be done here? Kind of a frequent style of query
> and quite slow.
>
Could you provide the definition of the view(s) down to the base tables?

--
Holger Jakobs, Bergisch Gladbach, Tel. +49-178-9759012

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Wells Oliver 2023-06-21 18:39:39 Re: Poor performance due to parallel seq scan on indexed date field
Previous Message Wells Oliver 2023-06-21 18:31:43 Poor performance due to parallel seq scan on indexed date field