CPU Consuming query. Sequential scan despite indexing.

From: aditya desai <admad123(at)gmail(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: CPU Consuming query. Sequential scan despite indexing.
Date: 2020-10-15 15:04:54
Message-ID: CAN0SRDFYi2+C9+LsV9mzmmLjjBt74g_0VR6d8uEaMV2fTnSYxA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,
Below query always shows up on top in the CPU matrix. Also despite having
indexes it does sequential scans(probably because WHERE condition satisfies
almost all of the data from table). This query runs on the default landing
page in application and needs to fetch records in less that 100 ms without
consuming too much CPU.

Any opinions? Table is very huge and due to referential identity and
business requirements we could not implement partitioning as well.

There is index on (countrycode,facilitycode,jobstartdatetime)

explain (analyze,buffers) with JobCount as ( select jobstatuscode,count(1)
stat_count from job j where 1=1 and j.countrycode = 'TH' and
j.facilitycode in
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
and ((j.jobstartdatetime between '2020-08-01 00:00:00' and '2020-09-30
00:00:00' ) or j.jobstartdatetime IS NULL ) group by j.jobstatuscode)
select js.jobstatuscode,COALESCE(stat_count,0) stat_count from JobCount jc
right outer join jobstatus js on jc.jobstatuscode=js.jobstatuscode;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------
Hash Right Join (cost=98845.93..98846.10 rows=10 width=12) (actual
time=1314.809..1314.849 rows=10 loops=1)
Hash Cond: ((jc.jobstatuscode)::text = (js.jobstatuscode)::text)
Buffers: shared hit=21314 read=3231
I/O Timings: read=19.867
CTE jobcount
-> Finalize GroupAggregate (cost=98842.93..98844.71 rows=7 width=12)
(actual time=1314.780..1314.802 rows=6 loops=1)
Group Key: j.jobstatuscode
Buffers: shared hit=21313 read=3231
I/O Timings: read=19.867
-> Gather Merge (cost=98842.93..98844.57 rows=14 width=12)
(actual time=1314.766..1314.857 rows=18 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=60102 read=11834
I/O Timings: read=59.194
-> Sort (cost=97842.91..97842.93 rows=7 width=12)
(actual time=1305.044..1305.047 rows=6 loops=3)
Sort Key: j.jobstatuscode
Sort Method: quicksort Memory: 25kB
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
Buffers: shared hit=60102 read=11834
I/O Timings: read=59.194
-> Partial HashAggregate (cost=97842.74..97842.81
rows=7 width=12) (actual time=1305.010..1305.013 rows=6 loops=3)
Group Key: j.jobstatuscode
Buffers: shared hit=60086 read=11834
I/O Timings: read=59.194
-> Parallel Seq Scan on job j
(cost=0.00..96837.93 rows=200963 width=4) (actual time=13.010..1144.434
rows=163200 loops=3)
Filter: (((countrycode)::text =
'TH'::text) AND (((jobstartdatetime >= '2020-08-01 00:00:00'::timestamp
without time zone) AND (jobst
artdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR
(jobstartdatetime IS NULL)) AND ((facilitycode)::text = ANY
('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1
,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
Rows Removed by Filter: 449035
Buffers: shared hit=60086 read=11834
I/O Timings: read=59.194
-> CTE Scan on jobcount jc (cost=0.00..0.14 rows=7 width=24) (actual
time=1314.784..1314.811 rows=6 loops=1)
Buffers: shared hit=21313 read=3231
I/O Timings: read=19.867
-> Hash (cost=1.10..1.10 rows=10 width=4) (actual time=0.014..0.015
rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on jobstatus js (cost=0.00..1.10 rows=10 width=4)
(actual time=0.005..0.008 rows=10 loops=1)
Buffers: shared hit=1
Planning Time: 0.949 ms
Execution Time: 1314.993 ms
(40 rows)

Regards,
Aditya.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Michael Lewis 2020-10-15 16:26:35 Re: CPU Consuming query. Sequential scan despite indexing.
Previous Message Tom Lane 2020-10-15 13:56:12 Re: Poor Performance running Django unit tests after upgrading from 10.6