Re: How to improve the performance of my SQL query?

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: gzh <gzhcoder(at)126(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: How to improve the performance of my SQL query?
Date: 2023-07-24 14:42:01
Message-ID: CACJufxEDxWZO9HbZ8Vqj-BncvqX3y3SqLr+0b23ekj=GkTpAUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Jul 24, 2023 at 5:54 PM gzh <gzhcoder(at)126(dot)com> wrote:
>
> >Did you change any parameters that have an impact on query planning?
>
> >You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
>
> I added some parameters and re-executed the Execution Plan.
>
> Except for the index not taking effect, I still don't know the reason why the index is not working.
>
> Is it because there is too much data that meets the conditions?
>
>
> EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
>
> select COUNT(ET_CD)
>
> from TBL_SHA
>
> WHERE TBL_SHA.MS_CD = 'MLD009'
>
> and TBL_SHA.ETRYS in
>
> (select TBL_INF.RY_CD
>
> from TBL_INF
>
> WHERE TBL_INF.MS_CD = 'MLD009'
>
> AND TBL_INF.RY_CD = '00000001'
>
> )
>
>
> ----- Execution Plan -----
>
> Limit (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1)
>
> Output: (count(tbl_sha.et_cd))
>
> Buffers: shared hit=58948 read=2112758
>
> I/O Timings: read=357249.120
>
> -> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1)
>
> Output: count(tbl_sha.et_cd)
>
> Buffers: shared hit=58948 read=2112758
>
> I/O Timings: read=357249.120
>
> -> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1)
>
> Output: tbl_sha.et_cd
>
> Buffers: shared hit=58948 read=2112758
>
> I/O Timings: read=357249.120
>
> -> Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1)
>
> Output: tbl_inf.ms_cd, tbl_inf.ry_cd
>
> Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar))
>
> Heap Fetches: 1
>
> Buffers: shared hit=4
>
> -> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1)
>
> Output: tbl_sha.et_cd, tbl_sha.etrys
>
> Workers Planned: 2
>
> Workers Launched: 2
>
> Buffers: shared hit=58944 read=2112758
>
> I/O Timings: read=357249.120
>
> -> Parallel Seq Scan on mtpdb.tbl_sha (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3)
>
> Output: tbl_sha.et_cd, tbl_sha.etrys
>
> Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar))
>
> Rows Removed by Filter: 14062278
>
> Buffers: shared hit=58944 read=2112758
>
> I/O Timings: read=357249.120
>
> Worker 0: actual time=1432.292..127762.181 rows=988036 loops=1
>
> Buffers: shared hit=17875 read=706862
>
> I/O Timings: read=119193.744
>
> Worker 1: actual time=1425.878..127786.777 rows=992381 loops=1
>
> Buffers: shared hit=19813 read=706359
>
> I/O Timings: read=119386.899
>
> Planning:
>
> Buffers: shared hit=42
>
> Planning Time: 1.024 ms
>
> Execution Time: 128717.731 ms
>
>

I think the whole query can just:
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';

> and TBL_SHA.ETRYS in
> (select TBL_INF.RY_CD
> from TBL_INF
> WHERE TBL_INF.MS_CD = 'MLD009'
> AND TBL_INF.RY_CD = '00000001'
> )

if subquery after IN clause part, no rows returned then the whole
query would return zero row.
if many duplicates rows returned, then there is no point of evaluate
something like {1 in (1,1,1,1,1,1,)}

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ron 2023-07-24 14:51:52 Re: Grant all privileges to user on a database
Previous Message Alvaro Herrera 2023-07-24 14:17:59 Re: \d don't print all the tables