From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | "Kneringer, Armin" <Armin(dot)Kneringer(at)fabasoft(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: potential performance gain by query planner optimization |
Date: | 2010-07-20 19:39:04 |
Message-ID: | AANLkTilwPLJssKY8TMc5fjYTTOPQsVsHSEgrjj_46MSM@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hello
2010/7/20 Kneringer, Armin <Armin(dot)Kneringer(at)fabasoft(dot)com>:
> Hi there.
>
> I think I found a potential performance gain if the query planner would be optimized. All Tests has been performed with 8.4.1 (and earlier versions) on CentOS 5.3 (x64)
>
> The following query will run on my database (~250 GB) for ca. 1600 seconds and the sort will result in a disk merge deploying ca. 200 GB of data to the local disk (ca. 180.000 tmp-files)
can you try show check explain with set enable_hashjoin to off; ?
Regards
Pavel Stehule
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> NOT EXISTS (
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t4.objid AND
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = t4.objid AND
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
>
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=2592137103.99..2592137104.00 rows=1 width=8)
> -> Sort (cost=2592137103.99..2592137104.00 rows=1 width=8)
> Sort Key: t4.objid
> -> Nested Loop (cost=1105592553.38..2592137103.98 rows=1 width=8)
> -> Hash Anti Join (cost=1105592553.38..2592137095.75 rows=1 width=8)
> Hash Cond: ((t4.objid = q1_1.objid) AND (t4.objid = t5.objid))
> -> Bitmap Heap Scan on fscsubfile t4 (cost=154.42...14136.40 rows=5486 width=8)
> Recheck Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
> -> Bitmap Index Scan on ind_fscsubfile_filerespons (cost=0.00..153.05 rows=5486 width=0)
> Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
> -> Hash (cost=11917516.57..11917516.57 rows=55006045159 width=16)
> -> Nested Loop (cost=0.00..11917516.57 rows=55006045159 width=16)
> -> Seq Scan on atdateval t5 (cost=0.00...294152.40 rows=1859934 width=12)
> Filter: (attrid = 281479288456447::bigint)
> -> Index Scan using ind_ataggval on ataggval q1_1 (cost=0.00..6.20 rows=4 width=12)
> Index Cond: ((q1_1.attrid = 281479288456451::bigint) AND (q1_1.aggval = t5.aggrid))
> Filter: (q1_1.aggrid = 0)
> -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> (20 rows)
>
>
> As the disks pace is limited on my test system I can't provide the "explain analyze" output
> If I change the query as follows the query takes only 12 seconds and only needs 2 tmp files for sorting.
> (Changed lines are marked with [!!!!!] as I don't know HTML-Mails will be delivered without conversion
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t4.objid AND
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = q1_1.objid AND [!!!!!]
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=918320.29..971968.88 rows=1 width=8)
> -> Nested Loop (cost=918320.29..971968.88 rows=1 width=8)
> -> Merge Anti Join (cost=918320.29..971960.65 rows=1 width=8)
> Merge Cond: (t4.objid = q1_1.objid)
> -> Index Scan using ind_fscsubfile_filerespons on fscsubfile t4 (cost=0.00..19016.05 rows=5486 width=8)
> Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
> -> Materialize (cost=912418.42..956599.36 rows=22689 width=8)
> -> Merge Join (cost=912418.42..956372.47 rows=22689 width=8)
> Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
> -> Sort (cost=402024.80..406674.63 rows=1859934 width=12)
> Sort Key: t5.objid, t5.aggrid
> -> Bitmap Heap Scan on atdateval t5 (cost=43749.07..176555.24 rows=1859934 width=12)
> Recheck Cond: (attrid = 281479288456447::bigint)
> -> Bitmap Index Scan on ind_atdateval (cost=0.00..43284.08 rows=1859934 width=0)
> Index Cond: (attrid = 281479288456447::bigint)
> -> Materialize (cost=510392.25..531663.97 rows=1701738 width=12)
> -> Sort (cost=510392.25..514646.59 rows=1701738 width=12)
> Sort Key: q1_1.objid, q1_1.aggval
> -> Bitmap Heap Scan on ataggval q1_1 (cost=44666.00..305189.47 rows=1701738 width=12)
> Recheck Cond: (attrid = 281479288456451::bigint)
> Filter: (aggrid = 0)
> -> Bitmap Index Scan on ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0)
> Index Cond: (attrid = 281479288456451::bigint)
> -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> (26 rows)
>
> explain analyze SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t4.objid AND
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = q1_1.objid AND [!!!!!]
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.598..12083.048 rows=64 loops=1)
> -> Nested Loop (cost=918320.29..971968.88 rows=1 width=8) (actual time=12079.594..12083.010 rows=64 loops=1)
> -> Merge Anti Join (cost=918320.29..971960.65 rows=1 width=8) (actual time=12037.524..12081.989 rows=108 loops=1)
> Merge Cond: (t4.objid = q1_1.objid)
> -> Index Scan using ind_fscsubfile_filerespons on fscsubfile t4 (cost=0.00..19016.05 rows=5486 width=8) (actual time=0.073..83.498 rows=63436 loops=1)
> Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
> -> Materialize (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8866.253..11753.055 rows=1299685 loops=1)
> -> Merge Join (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8866.246..11413.397 rows=1299685 loops=1)
> Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
> -> Sort (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3133.362..3774.076 rows=1299685 loops=1)
> Sort Key: t5.objid, t5.aggrid
> Sort Method: external merge Disk: 47192kB
> -> Bitmap Heap Scan on atdateval t5 (cost=43749.07..176555.24 rows=1859934 width=12) (actual time=282.454..1079.038 rows=1857906 loops=1)
> Recheck Cond: (attrid = 281479288456447::bigint)
> -> Bitmap Index Scan on ind_atdateval (cost=0.00..43284.08 rows=1859934 width=0) (actual time=258.749...258.749 rows=1857906 loops=1)
> Index Cond: (attrid = 281479288456447::bigint)
> -> Materialize (cost=510392.25..531663.97 rows=1701738 width=12) (actual time=5732.872..6683.784 rows=1299685 loops=1)
> -> Sort (cost=510392.25..514646.59 rows=1701738 width=12) (actual time=5732.866..6387.188 rows=1299685 loops=1)
> Sort Key: q1_1.objid, q1_1.aggval
> Sort Method: external merge Disk: 39920kB
> -> Bitmap Heap Scan on ataggval q1_1 (cost=44666.00..305189.47 rows=1701738 width=12) (actual time=1644.983..3634.044 rows=1857906 loops=1)
> Recheck Cond: (attrid = 281479288456451::bigint)
> Filter: (aggrid = 0)
> -> Bitmap Index Scan on ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0) (actual time=1606.325..1606.325 rows=1877336 loops=1)
> Index Cond: (attrid = 281479288456451::bigint)
> -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8) (actual time=0.009..0.009 rows=1 loops=108)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> Total runtime: 12108.663 ms
> (29 rows)
>
>
> Another way to optimize my query is to change it as follows:
> (Once again changes are marked with [!!!!!]
>
> explain SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t5.objid AND [!!!!!]
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = t4.objid AND
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=916978.86..969139.72 rows=1 width=8)
> -> Nested Loop (cost=916978.86..969139.72 rows=1 width=8)
> -> Merge Anti Join (cost=916978.86..969131.49 rows=1 width=8)
> Merge Cond: (t4.objid = t5.objid)
> -> Index Scan using ind_fscsubfile_filerespons on fscsubfile t4 (cost=0.00..19016.05 rows=5486 width=8)
> Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
> -> Materialize (cost=912418.42..956599.36 rows=22689 width=8)
> -> Merge Join (cost=912418.42..956372.47 rows=22689 width=8)
> Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
> -> Sort (cost=402024.80..406674.63 rows=1859934 width=12)
> Sort Key: t5.objid, t5.aggrid
> -> Bitmap Heap Scan on atdateval t5 (cost=43749.07..176555.24 rows=1859934 width=12)
> Recheck Cond: (attrid = 281479288456447::bigint)
> -> Bitmap Index Scan on ind_atdateval (cost=0.00..43284.08 rows=1859934 width=0)
> Index Cond: (attrid = 281479288456447::bigint)
> -> Materialize (cost=510392.25..531663.97 rows=1701738 width=12)
> -> Sort (cost=510392.25..514646.59 rows=1701738 width=12)
> Sort Key: q1_1.objid, q1_1.aggval
> -> Bitmap Heap Scan on ataggval q1_1 (cost=44666.00..305189.47 rows=1701738 width=12)
> Recheck Cond: (attrid = 281479288456451::bigint)
> Filter: (aggrid = 0)
> -> Bitmap Index Scan on ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0)
> Index Cond: (attrid = 281479288456451::bigint)
> -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> (26 rows)
>
>
> explain analyze SELECT DISTINCT t4.objid
> FROM fscsubfile t4, cooobject t6
> WHERE t6.objid = t4.objid AND
> t4.fileresporgid = 573936067464397682 AND
> NOT EXISTS (
> SELECT 1
> FROM ataggval q1_1,
> atdateval t5
> WHERE q1_1.objid = t5.objid AND [!!!!!]
> q1_1.attrid = 281479288456451 AND
> q1_1.aggrid = 0 AND
> t5.aggrid = q1_1.aggval AND
> t5.objid = t4.objid AND
> t5.attrid = 281479288456447 ) AND
> ((t6.objclassid IN (285774255832590,285774255764301))) AND
> ((t4.objid > 573936097512390656 and t4.objid < 573936101807357952))
> ORDER BY t4.objid;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Unique (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.964..12106.409 rows=64 loops=1)
> -> Nested Loop (cost=916978.86..969139.72 rows=1 width=8) (actual time=12102.959..12106.375 rows=64 loops=1)
> -> Merge Anti Join (cost=916978.86..969131.49 rows=1 width=8) (actual time=12060.916..12105.374 rows=108 loops=1)
> Merge Cond: (t4.objid = t5.objid)
> -> Index Scan using ind_fscsubfile_filerespons on fscsubfile t4 (cost=0.00..19016.05 rows=5486 width=8) (actual time=0.080..81.397 rows=63436 loops=1)
> Index Cond: ((fileresporgid = 573936067464397682::bigint) AND (objid > 573936097512390656::bigint) AND (objid < 573936101807357952::bigint))
> -> Materialize (cost=912418.42..956599.36 rows=22689 width=8) (actual time=8874.492..11778.254 rows=1299685 loops=1)
> -> Merge Join (cost=912418.42..956372.47 rows=22689 width=8) (actual time=8874.484..11437.175 rows=1299685 loops=1)
> Merge Cond: ((t5.objid = q1_1.objid) AND (t5.aggrid = q1_1.aggval))
> -> Sort (cost=402024.80..406674.63 rows=1859934 width=12) (actual time=3117.555..3756.062 rows=1299685 loops=1)
> Sort Key: t5.objid, t5.aggrid
> Sort Method: external merge Disk: 39920kB
> -> Bitmap Heap Scan on atdateval t5 (cost=43749.07..176555.24 rows=1859934 width=12) (actual time=289.475..1079.624 rows=1857906 loops=1)
> Recheck Cond: (attrid = 281479288456447::bigint)
> -> Bitmap Index Scan on ind_atdateval (cost=0.00..43284.08 rows=1859934 width=0) (actual time=265.720...265.720 rows=1857906 loops=1)
> Index Cond: (attrid = 281479288456447::bigint)
> -> Materialize (cost=510392.25..531663.97 rows=1701738 width=12) (actual time=5756.915..6707.864 rows=1299685 loops=1)
> -> Sort (cost=510392.25..514646.59 rows=1701738 width=12) (actual time=5756.909..6409.819 rows=1299685 loops=1)
> Sort Key: q1_1.objid, q1_1.aggval
> Sort Method: external merge Disk: 39920kB
> -> Bitmap Heap Scan on ataggval q1_1 (cost=44666.00..305189.47 rows=1701738 width=12) (actual time=1646.955..3628.918 rows=1857906 loops=1)
> Recheck Cond: (attrid = 281479288456451::bigint)
> Filter: (aggrid = 0)
> -> Bitmap Index Scan on ind_ataggval (cost=0.00..44240.56 rows=1860698 width=0) (actual time=1608.233..1608.233 rows=1877336 loops=1)
> Index Cond: (attrid = 281479288456451::bigint)
> -> Index Scan using cooobjectix on cooobject t6 (cost=0.00..8.22 rows=1 width=8) (actual time=0.008..0.009 rows=1 loops=108)
> Index Cond: (t6.objid = t4.objid)
> Filter: (t6.objclassid = ANY ('{285774255832590,285774255764301}'::bigint[]))
> Total runtime: 12129.613 ms
> (29 rows)
>
>
>
> As the query performs in roughly 12 seconds in both (changed) cases you might advise to change my queries :-)
> (In fact we are working on this)
> As the primary performance impact can also be reproduced in a small database (querytime > 1 minute) I checked this issue on MS-SQL server and Oracle. On MSSQL server there is no difference in the execution plan if you change the query an the performance is well. Oralce shows a slightly difference but the performance is also well.
> As I mentioned we are looking forward to change our query but in my opinion there could be a general performance gain if this issue is addressed. (especially if you don't know you run into this issue on the query performance is sufficient enough)
>
> greets
> Armin
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Carey | 2010-07-20 22:58:07 | Re: IDE x SAS RAID 0 on HP DL 380 G5 P400i controller performance problem |
Previous Message | Kneringer, Armin | 2010-07-20 16:25:55 | potential performance gain by query planner optimization |