Re: Query plan changes after pg_dump / pg_restore

From: Jona <jonanews(at)oismail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query plan changes after pg_dump / pg_restore
Date: 2005-06-09 16:15:04
Message-ID: 42A86B08.4070005@oismail.com
Lists: pgsql-performance

Hi Tom,
Thank you for the input, you're absolutely right.
Have just executed about ten VACUUM ANALYZE runs on Price_Tbl in both
databases, and now both queries use the same plan... the bad one. Great!
Who said ignorance is bliss?? ;-)

Have just messed around with ALTER TABLE ... ALTER ... SET STATISTICS
for both tables, to no effect.
Have tried setting both a high target (100 and 200) and a low one (1),
and run several VACUUM ANALYZEs afterwards.
It still insists on the bad plan...
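For reference, this is the shape of what I ran (the column names are my guesses
based on the join conditions in the plans below; the target values are just the
ones I tried):

```sql
-- Raise the per-column statistics target for the join columns,
-- then re-gather statistics so the planner sees the new sample size.
ALTER TABLE price_tbl ALTER COLUMN sctid SET STATISTICS 200;
ALTER TABLE sct2subcattype_tbl ALTER COLUMN subcattpid SET STATISTICS 200;
ANALYZE price_tbl;
ANALYZE sct2subcattype_tbl;
```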

Furthermore I've played around with the random_page_cost runtime parameter.
It seems that when I set it to 2.2 the planner switches to using the
aff_price_uq index on Price_Tbl, but it needs to be set as low as 0.7 before
it uses the subcat_uq index on SCT2SubCatType_Tbl.
Whether the statistics target is set to 1 or 100 makes no difference to this
behaviour.
The overall plan remains the same though, and even when it uses both
indexes the total cost estimate is roughly 5.5 times higher than the good plan's.
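The experiments were of this form, using a session-level SET so nothing is
changed permanently (the actual SELECT is elided here):

```sql
-- Lower random_page_cost for this session only and re-check the plan.
SET random_page_cost = 2.2;   -- planner picks aff_price_uq at this point
EXPLAIN ANALYZE SELECT ...;
SET random_page_cost = 0.7;   -- needed before it also picks subcat_uq
EXPLAIN ANALYZE SELECT ...;
```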

New plan:

Unique  (cost=612.29..612.65 rows=3 width=75) (actual time=255.88..255.89 rows=3 loops=1)
  ->  Hash Join  (cost=158.26..596.22 rows=288 width=75) (actual time=60.91..99.69 rows=2477 loops=1)
        Hash Cond: ("outer".sctid = "inner".sctid)
        ->  Index Scan using aff_price_uq on price_tbl  (cost=0.00..409.24 rows=5025 width=4) (actual time=0.03..17.81 rows=5157 loops=1)
              Index Cond: (affid = 8)
        ->  Hash  (cost=157.37..157.37 rows=355 width=71) (actual time=60.77..60.77 rows=0 loops=1)
              ->  Merge Join  (cost=10.26..157.37 rows=355 width=71) (actual time=14.42..53.79 rows=2493 loops=1)
                    Merge Cond: ("outer".subcattpid = "inner".id)
                    ->  Index Scan using subcat_uq on sct2subcattype_tbl  (cost=0.00..126.28 rows=6536 width=8) (actual time=0.03..23.25 rows=6527 loops=1)
                    ->  Sort  (cost=10.26..10.28 rows=9 width=63) (actual time=2.46..5.66 rows=2507 loops=1)

Total runtime: 257.49 msec

Old "good" plan:

Unique  (cost=117.18..117.20 rows=1 width=147) (actual time=224.62..224.63 rows=3 loops=1)
  ->  Index Scan using subcat_uq on sct2subcattype_tbl  (cost=0.00..100.47 rows=33 width=8) (actual time=0.01..0.20 rows=46 loops=54)
        Index Cond: ("outer".id = sct2subcattype_tbl.subcattpid)
  ->  Index Scan using aff_price_uq on price_tbl  (cost=0.00..7.11 rows=1 width=4) (actual time=0.01..0.01 rows=1 loops=2493)
        Index Cond: ((price_tbl.affid = 8) AND ("outer".sctid = price_tbl.sctid))

Total runtime: 225.14 msec

It seems that the more it knows about the data, the worse the plan it chooses.
Could you provide some input on how to make it realise that the plan it
selects is not optimal?

Cheers
Jona

Tom Lane wrote:

>Jona <jonanews(at)oismail(dot)com> writes:
>
>>What worries me is that the plan is different,
>
>Given that the estimated costs are close to the same, this is probably
>just the result of small differences in the ANALYZE statistics leading
>to small differences in cost estimates and thus choice of different
>plans. I'll bet if you re-ANALYZE a few times on the source database
>you'll see it flipping between plan choices too. This is normal because
>ANALYZE takes a random sample of rows rather than being exhaustive.
>
>So the interesting question is not "why are the plan choices different"
>it is "how do I get the cost estimates closer to reality". That's the
>only way in the long run to ensure the planner makes the right choice.
>Increasing the statistics targets or fooling with planner cost
>parameters are the basic tools you have available here.
>
> regards, tom lane
>
>
