Re: Is there any good optimization solution to improve the query efficiency?

From: Oliver Kohll <oliver(at)agilebase(dot)co(dot)uk>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Is there any good optimization solution to improve the query efficiency?
Date: 2023-06-05 07:45:50
Message-ID: CAMS=m5LtOO0NscRYr_7y0HVUw9T+LGog-RvV+T9p-_o=jrRYCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, 5 Jun 2023 at 07:56, gzh <gzhcoder(at)126(dot)com> wrote:

> Hi everyone,
> I'm running into some performance issues with my SQL query.
> The following SQL query is taking a long time to execute.
>
> Execution Plan:
> explain analyse
> select * from TBL_RES
> left outer join(select T_CUST.RSNO RSNO2 ,
> T_CUST.KNO ,
> T_CUST.AGE ,
> T_CUST.GST
> from TBL_CUST T_CUST ,
> (select T_CUST.RSNO ,
> T_CUST.KNO ,
> MIN(T_CUST.GSTSEQ) GSTSEQ
> from TBL_CUST T_CUST ,
> TBL_POV T_POV ,
> TBL_RES T_RES
> where T_CUST.STSFLG = 'T'
> and T_CUST.DISPSEQ <> 9999
> AND T_CUST.KFIX = '0'
> and T_POV.CRSNO = T_RES.CRSNO
> and T_RES.RSNO = T_CUST.RSNO
> group by T_CUST.RSNO , T_CUST.KNO) T_POV2
> where T_POV2.RSNO = T_CUST.RSNO
> and T_POV2.KNO = T_CUST.KNO
> and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2
> and TBL_RES.KNO = T_POV3.KNO
> where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
> and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
> and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
> and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
> ----- Execution Plan -----
> Nested Loop Left Join (cost=254388.44..452544.70 rows=473 width=3545)
> (actual time=3077.312..996048.714 rows=15123 loops=1)
> Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text =
> (T_CUST.KNO)::text))
> Rows Removed by Join Filter: 4992268642
> -> Gather (cost=1000.00..58424.35 rows=473 width=3489) (actual
> time=0.684..14.158 rows=15123 loops=1)
> Workers Planned: 2
> Workers Launched: 2
> -> Parallel Seq Scan on TBL_RES (cost=0.00..57377.05 rows=197
> width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
> Filter: ((CID >= to_date('2022/07/01'::text,
> 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text,
> 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text,
> 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text,
> 'YYYY/MM/DD'::text)))
> Rows Removed by Filter: 161714
> -> Materialize (cost=253388.44..394112.08 rows=1 width=56) (actual
> time=0.081..26.426 rows=330111 loops=15123)
> -> Hash Join (cost=253388.44..394112.07 rows=1 width=56) (actual
> time=1197.484..2954.084 rows=330111 loops=1)
> Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND
> ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq =
> (min(T_CUST_1.gstseq))))
> -> Seq Scan on TBL_CUST T_CUST (cost=0.00..79431.15
> rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
> -> Hash (cost=246230.90..246230.90 rows=262488 width=50)
> (actual time=1197.025..1209.957 rows=330111 loops=1)
> Buckets: 65536 Batches: 8 Memory Usage: 2773kB
> -> Finalize GroupAggregate
> (cost=205244.84..243606.02 rows=262488 width=50) (actual
> time=788.552..1116.074 rows=330111 loops=1)
> Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
> -> Gather Merge (cost=205244.84..238964.80
> rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
> Workers Planned: 2
> Workers Launched: 1
> -> Partial GroupAggregate
> (cost=204244.81..206933.27 rows=134423 width=50) (actual
> time=784.032..900.979 rows=165056 loops=2)
> Group Key: T_CUST_1.RSNO,
> T_CUST_1.KNO
> -> Sort (cost=204244.81..204580.87
> rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
> Sort Key: T_CUST_1.RSNO,
> T_CUST_1.KNO
> Sort Method: external merge
> Disk: 5480kB
> Worker 0: Sort Method:
> external merge Disk: 5520kB
> -> Parallel Hash Join
> (cost=111758.80..190036.38 rows=134423 width=23) (actual
> time=645.302..716.247 rows=165061 loops=2)
> Hash Cond:
> (T_CUST_1.RSNO = T_RES.RSNO)
> -> Parallel Seq Scan on
> TBL_CUST T_CUST_1 (cost=0.00..74013.63 rows=204760 width=23) (actual
> time=0.018..264.390 rows=165058 loops=2)
> Filter: ((dispseq
> <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text =
> '0'::text))
> Rows Removed by
> Filter: 835318
> -> Parallel Hash
> (cost=109508.52..109508.52 rows=137142 width=8) (actual
> time=343.593..343.896 rows=165058 loops=2)
> Buckets: 131072
> Batches: 8 Memory Usage: 3008kB
> -> Parallel Hash
> Join (cost=51834.70..109508.52 rows=137142 width=8) (actual
> time=256.732..314.368 rows=165058 loops=2)
> Hash Cond:
> ((T_RES.crsno)::text = (T_POV.crsno)::text)
> -> Parallel
> Seq Scan on TBL_RES T_RES (cost=0.00..53199.02 rows=208902 width=17)
> (actual time=0.007..100.510 rows=250132 loops=2)
> -> Parallel
> Hash (cost=49450.42..49450.42 rows=137142 width=9) (actual
> time=122.308..122.309 rows=165054 loops=2)
>
> Buckets: 131072 Batches: 8 Memory Usage: 2976kB
> ->
> Parallel Seq Scan on TBL_POV T_POV (cost=0.00..49450.42 rows=137142
> width=9) (actual time=0.037..89.470 rows=165054 loops=2)
> Planning Time: 1.064 ms
> Execution Time: 996062.382 ms
>
> --------------------------------------------------------------------------------
>
> The amount of data in the table is as follows.
> TBL_RES 500265
> TBL_CUST 2000752
> TBL_POV 330109
>
> Any suggestions for improving the performance of the query would be
> greatly appreciated.
>
> Thanks in advance!
>
>
Welcome to query optimisation, I hope you'll enjoy working on this problem,
it's one of the things I and many others love.

From a quick glance, it looks like there are opportunities for index use
there but the plan doesn't contain any. Do any indexes exist?

You could also start looking at it in parts, e.g. separate out the
inner T_POV2 subselect. It looks like that contains a cross join, which is
then whittled down with WHERE clauses. You could try adding ON to the FROM
instead. https://www.postgresql.org/docs/15/sql-select.html#SQL-FROM

Oliver

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ron 2023-06-05 07:59:09 Re: Is there any good optimization solution to improve the query efficiency?
Previous Message Laurenz Albe 2023-06-05 07:27:12 Re: [Question]What will happen if the server active close the connection?