Quick Links

Re: How to improve the performance of my SQL query？

From:	gzh <gzhcoder(at)126(dot)com>
To:	"jian he" <jian(dot)universality(at)gmail(dot)com>
Cc:	"Laurenz Albe" <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject:	Re: How to improve the performance of my SQL query？
Date:	2023-07-25 03:58:42
Message-ID:	77fe4763.2f4d.1898b32fdd2.Coremail.gzhcoder@126.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Thank you for your reply.

>I think the whole query can just:
>select COUNT(ET_CD)
>from TBL_SHA
>WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';
This is related to the business logic.

>if many duplicates rows returned, then there is no point of evaluate
>something like {1 in (1,1,1,1,1,1,)}
Because the primary key of the 'tbl_inf' table only consists of 'ms_cd' and 'ry_cd' columns, the subquery will not return duplicate rows.

At 2023-07-24 22:42:01, "jian he" <jian(dot)universality(at)gmail(dot)com> wrote:
>On Mon, Jul 24, 2023 at 5:54 PM gzh <gzhcoder(at)126(dot)com> wrote:
>>
>> >Did you change any parameters that have an impact on query planning?
>>
>> >You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
>>
>> I added some parameters and re-executed the Execution Plan.
>>
>> Except for the index not taking effect, I still don't know the reason why the index is not working.
>>
>> Is it because there is too much data that meets the conditions?
>>
>>
>> EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
>>
>> select COUNT(ET_CD)
>>
>> from TBL_SHA
>>
>> WHERE TBL_SHA.MS_CD = 'MLD009'
>>
>> and TBL_SHA.ETRYS in
>>
>> (select TBL_INF.RY_CD
>>
>> from TBL_INF
>>
>> WHERE TBL_INF.MS_CD = 'MLD009'
>>
>> AND TBL_INF.RY_CD = '00000001'
>>
>> )
>>
>>
>> ----- Execution Plan -----
>>
>> Limit (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1)
>>
>> Output: (count(tbl_sha.et_cd))
>>
>> Buffers: shared hit=58948 read=2112758
>>
>> I/O Timings: read=357249.120
>>
>> -> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1)
>>
>> Output: count(tbl_sha.et_cd)
>>
>> Buffers: shared hit=58948 read=2112758
>>
>> I/O Timings: read=357249.120
>>
>> -> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1)
>>
>> Output: tbl_sha.et_cd
>>
>> Buffers: shared hit=58948 read=2112758
>>
>> I/O Timings: read=357249.120
>>
>> -> Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1)
>>
>> Output: tbl_inf.ms_cd, tbl_inf.ry_cd
>>
>> Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar))
>>
>> Heap Fetches: 1
>>
>> Buffers: shared hit=4
>>
>> -> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1)
>>
>> Output: tbl_sha.et_cd, tbl_sha.etrys
>>
>> Workers Planned: 2
>>
>> Workers Launched: 2
>>
>> Buffers: shared hit=58944 read=2112758
>>
>> I/O Timings: read=357249.120
>>
>> -> Parallel Seq Scan on mtpdb.tbl_sha (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3)
>>
>> Output: tbl_sha.et_cd, tbl_sha.etrys
>>
>> Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar))
>>
>> Rows Removed by Filter: 14062278
>>
>> Buffers: shared hit=58944 read=2112758
>>
>> I/O Timings: read=357249.120
>>
>> Worker 0: actual time=1432.292..127762.181 rows=988036 loops=1
>>
>> Buffers: shared hit=17875 read=706862
>>
>> I/O Timings: read=119193.744
>>
>> Worker 1: actual time=1425.878..127786.777 rows=992381 loops=1
>>
>> Buffers: shared hit=19813 read=706359
>>
>> I/O Timings: read=119386.899
>>
>> Planning:
>>
>> Buffers: shared hit=42
>>
>> Planning Time: 1.024 ms
>>
>> Execution Time: 128717.731 ms
>>
>>
>
>I think the whole query can just:
>select COUNT(ET_CD)
>from TBL_SHA
>WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';
>
>> and TBL_SHA.ETRYS in
>> (select TBL_INF.RY_CD
>> from TBL_INF
>> WHERE TBL_INF.MS_CD = 'MLD009'
>> AND TBL_INF.RY_CD = '00000001'
>> )
>
>if subquery after IN clause part, no rows returned then the whole
>query would return zero row.
>if many duplicates rows returned, then there is no point of evaluate
>something like {1 in (1,1,1,1,1,1,)}

In response to

Re: How to improve the performance of my SQL query？ at 2023-07-24 14:42:01 from jian he

Browse pgsql-general by date

	From	Date	Subject
Next Message	Laurenz Albe	2023-07-25 05:56:25	Re: How to improve the performance of my SQL query？
Previous Message	gzh	2023-07-25 03:11:13	Re: How to improve the performance of my SQL query？