Poor performance due to parallel seq scan on indexed date field

From: Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
To: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Poor performance due to parallel seq scan on indexed date field
Date: 2023-06-21 18:31:43
Message-ID: CAOC+FBVYuQ9gkEYo_ZOioNEi9HwTETCF5bjpgYfmTXq9LpQf8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Dead simple date scan across a big-ish table (est. 23,153,666 rows)

explain analyze select count(*) from vw_pitches where game_date >=
'2022-06-21' and game_date <= '2023-06-21';

The view does do some joins but those don't seem to be the issue to me.

Planner does:

Finalize Aggregate (cost=3596993.88..3596993.89 rows=1 width=8) (actual
time=69980.491..69982.076 rows=1 loops=1)
-> Gather (cost=3596993.46..3596993.87 rows=4 width=8) (actual
time=69979.137..69982.071 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=3595993.46..3595993.47 rows=1 width=8)
(actual time=69975.136..69975.137 rows=1 loops=5)
-> Nested Loop (cost=0.44..3591408.37 rows=1834034 width=0)
(actual time=0.882..69875.934 rows=1458419 loops=5)
-> Parallel Seq Scan on pitches p
(cost=0.00..3537431.89 rows=1834217 width=12) (actual
time=0.852..68914.256 rows=1458419 loops=5)
Filter: ((game_date >= '2022-06-21'::date) AND
(game_date <= '2023-06-21'::date))
Rows Removed by Filter: 3212310
-> Memoize (cost=0.44..0.47 rows=1 width=4) (actual
time=0.000..0.000 rows=1 loops=7292095)
Cache Key: p.pitcher_identity_id
Cache Mode: logical
Hits: 1438004 Misses: 21042 Evictions: 0
Overflows: 0 Memory Usage: 2138kB
Worker 0: Hits: 1429638 Misses: 21010
Evictions: 0 Overflows: 0 Memory Usage: 2134kB
Worker 1: Hits: 1456755 Misses: 21435
Evictions: 0 Overflows: 0 Memory Usage: 2177kB
Worker 2: Hits: 1433557 Misses: 21201
Evictions: 0 Overflows: 0 Memory Usage: 2154kB
Worker 3: Hits: 1428727 Misses: 20726
Evictions: 0 Overflows: 0 Memory Usage: 2105kB
-> Index Only Scan using identity_pkey on
identity idpitcher (cost=0.43..0.46 rows=1 width=4) (actual
time=0.007..0.007 rows=1 loops=105414)
Index Cond: (identity_id =
p.pitcher_identity_id)
Heap Fetches: 83
Planning Time: 1.407 ms
Execution Time: 69982.927 ms

Is there something to be done here? Kind of a frequent style of query and
quite slow.

--
Wells Oliver
wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Holger Jakobs 2023-06-21 18:37:29 Re: Poor performance due to parallel seq scan on indexed date field
Previous Message Sean O'Grady 2023-06-21 17:32:13 Re: Question about wal_compression and what to expect