Re: Help optimize view

From: "Relyea, Mike" <Mike(dot)Relyea(at)xerox(dot)com>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, <pgsql-performance(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Help optimize view
Date: 2007-08-13 21:00:28
Message-ID: 1806D1F73FCB7F439F2C842EE0627B18065F7A86@USA0300MS01.na.xerox.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> >>> On Mon, Aug 13, 2007 at 1:48 PM, in message
> <1806D1F73FCB7F439F2C842EE0627B18065F78DF(at)USA0300MS01(dot)na(dot)xerox
> .net>, "Relyea, Mike" <Mike(dot)Relyea(at)xerox(dot)com> wrote:
> > I've increased shared_buffers to 128MB, and restarted the
> server. My
> > total run time didn't really change.
>
> Please forgive me if this guess doesn't help either, but
> could you try eliminating the GROUP BY options which don't
> echo values in the select value list, and move the HAVING
> conditions to a WHERE clause? Something like:
>
> explain analyze
> SELECT
> "PrintSamples"."MachineID",
> "PrintSamples"."PrintCopyID",
> "tblColors"."ColorID",
> avg("ParameterValues"."ParameterValue") AS "Mottle_NMF"
> FROM "AnalysisModules"
> JOIN
> (
> "tblColors"
> JOIN
> (
> "tblTPNamesAndColors"
> JOIN "PrintSamples"
> ON ("tblTPNamesAndColors"."TestPatternName"::text =
> "PrintSamples"."TestPatternName"::text)
> JOIN
> (
> "DigitalImages"
> JOIN "PrintSampleAnalyses"
> ON ("DigitalImages"."ImageID" =
> "PrintSampleAnalyses"."ImageID")
> JOIN
> (
> "ParameterNames"
> JOIN
> (
> "Measurements"
> JOIN "ParameterValues"
> ON "Measurements"."MeasurementID" =
> "ParameterValues"."MeasurementID"
> ) ON "ParameterNames"."ParameterID" =
> "ParameterValues"."ParameterID"
> ) ON "PrintSampleAnalyses"."psaID" = "Measurements"."psaID"
> ) ON "PrintSamples"."PrintSampleID" =
> "DigitalImages"."PrintSampleID"
> ) ON "tblColors"."ColorID" = "tblTPNamesAndColors"."ColorID"
> ) ON "AnalysisModules"."MetricID" = "Measurements"."MetricID"
> WHERE "AnalysisModules"."AnalysisModuleName"::text = 'NMF'::text
> AND "ParameterNames"."ParameterName"::text = 'NMF'::text
> AND "PrintSamples"."TestPatternName"::text ~~ 'IQAF-TP8%'::text
> AND "tblColors"."ColorID" <> 3
> GROUP BY
> "PrintSamples"."MachineID",
> "PrintSamples"."PrintCopyID",
> "tblColors"."ColorID"
> ;
>
> I'd also be inclined to simplify the FROM clause by
> eliminating the parentheses and putting the ON conditions
> closer to where they are used, but that would be more for
> readability than any expectation that it would affect the plan.
>
> -Kevin

Thanks for your help. Re-writing the view like this maybe bought me
something. I've pasted the explain analyze results below. Tough to
tell because I also increased some of the statistics. From what Tom
says, it sounds like if I want the data returned faster I'm likely to
have to get beefier hardware.

ALTER TABLE "ParameterValues" ALTER "MeasurementID" SET STATISTICS 500;

ALTER TABLE "ParameterValues" ALTER "ParameterID" SET STATISTICS 500;

ANALYZE "ParameterValues";

ALTER TABLE "Measurements" ALTER COLUMN "MetricID" SET STATISTICS 500;

ALTER TABLE "Measurements" ALTER COLUMN "psaID" SET STATISTICS 500;

ANALYZE "Measurements";

Running the above SQL:

HashAggregate (cost=461541.53..461634.88 rows=7468 width=16) (actual
time=110002.041..110024.777 rows=14853 loops=1)
-> Hash Join (cost=230789.57..461464.70 rows=7683 width=16) (actual
time=56847.814..109936.722 rows=15123 loops=1)
Hash Cond: ("Measurements"."MetricID" =
"AnalysisModules"."MetricID")
-> Hash Join (cost=230787.80..461057.64 rows=87588 width=20) (actual
time=56847.697..109884.122 rows=15123 loops=1)
Hash Cond: ("ParameterValues"."MeasurementID" =
"Measurements"."MeasurementID")
-> Nested Loop (cost=6353.15..234044.47 rows=454038 width=8)
(actual time=179.154..52780.680 rows=289724 loops=1)
-> Seq Scan on "ParameterNames" (cost=0.00..1.94 rows=1
width=4) (actual time=0.012..0.027 rows=1 loops=1)
Filter: (("ParameterName")::text = 'NMF'::text)
-> Bitmap Heap Scan on "ParameterValues"
(cost=6353.15..228047.32 rows=479617 width=12) (actual
time=179.123..52102.572 rows=289724 loops=1)
Recheck Cond: ("ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID")
-> Bitmap Index Scan on "PVParameterID_idx"
(cost=0.00..6233.25 rows=479617 width=0) (actual time=152.752..152.752
rows=289724 loops=1)
Index Cond: ("ParameterNames"."ParameterID" =
"ParameterValues"."ParameterID")
-> Hash (cost=206253.42..206253.42 rows=1454498 width=20) (actual
time=56657.022..56657.022 rows=961097 loops=1)
-> Nested Loop (cost=5069.24..206253.42 rows=1454498 width=20)
(actual time=932.249..55176.315 rows=961097 loops=1)
-> Hash Join (cost=5069.24..7949.67 rows=15206 width=16)
(actual time=908.275..1257.120 rows=18901 loops=1)
Hash Cond: ("PrintSampleAnalyses"."ImageID" =
"DigitalImages"."ImageID")
-> Seq Scan on "PrintSampleAnalyses" (cost=0.00..2334.25
rows=78825 width=8) (actual time=10.440..139.945 rows=78859 loops=1)
-> Hash (cost=4879.10..4879.10 rows=15211 width=16)
(actual time=897.776..897.776 rows=18901 loops=1)
-> Hash Join (cost=2220.11..4879.10 rows=15211
width=16) (actual time=297.330..868.632 rows=18901 loops=1)
Hash Cond: ("DigitalImages"."PrintSampleID" =
"PrintSamples"."PrintSampleID")
-> Seq Scan on "DigitalImages" (cost=0.00..1915.50
rows=78850 width=8) (actual time=15.859..408.784 rows=78859 loops=1)
-> Hash (cost=2029.98..2029.98 rows=15211
width=16) (actual time=281.413..281.413 rows=18645 loops=1)
-> Hash Join (cost=564.39..2029.98 rows=15211
width=16) (actual time=84.182..251.833 rows=18645 loops=1)
Hash Cond:
(("PrintSamples"."TestPatternName")::text =
("tblTPNamesAndColors"."TestPatternName")::text)
-> Bitmap Heap Scan on "PrintSamples"
(cost=561.39..1781.53 rows=24891 width=40) (actual time=83.925..184.775
rows=24914 loops=1)
Filter: (("TestPatternName")::text ~~
'IQAF-TP8%'::text)
-> Bitmap Index Scan on
"PSTestPatternName_idx" (cost=0.00..555.17 rows=24891 width=0) (actual
time=74.198..74.198 rows=24914 loops=1)
Index Cond:
((("TestPatternName")::text >= 'IQAF-TP8'::character varying) AND
(("TestPatternName")::text < 'IQAF-TP9'::character varying))
-> Hash (cost=2.72..2.72 rows=22 width=30)
(actual time=0.225..0.225 rows=21 loops=1)
-> Hash Join (cost=1.09..2.72 rows=22
width=30) (actual time=0.086..0.184 rows=21 loops=1)
Hash Cond:
("tblTPNamesAndColors"."ColorID" = "tblColors"."ColorID")
-> Seq Scan on "tblTPNamesAndColors"
(cost=0.00..1.30 rows=30 width=30) (actual time=0.025..0.060 rows=30
loops=1)
-> Hash (cost=1.05..1.05 rows=3
width=4) (actual time=0.040..0.040 rows=3 loops=1)
-> Seq Scan on "tblColors"
(cost=0.00..1.05 rows=3 width=4) (actual time=0.021..0.027 rows=3
loops=1)
Filter: ("ColorID" <> 3)
-> Index Scan using "MpsaID_idx" on "Measurements"
(cost=0.00..11.13 rows=153 width=12) (actual time=1.615..2.728 rows=51
loops=18901)
Index Cond: ("PrintSampleAnalyses"."psaID" =
"Measurements"."psaID")
-> Hash (cost=1.71..1.71 rows=5 width=4) (actual time=0.092..0.092
rows=5 loops=1)
-> Seq Scan on "AnalysisModules" (cost=0.00..1.71 rows=5 width=4)
(actual time=0.060..0.077 rows=5 loops=1)
Filter: (("AnalysisModuleName")::text = 'NMF'::text)
Total runtime: 110047.601 ms

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2007-08-13 21:01:40 Re: Stable function optimisation
Previous Message Philipp Specht 2007-08-13 20:37:59 Stable function optimisation