Re: Huge Performance Difference on Similar Query in Pg7.2

From: Heiko Klein <Heiko(dot)Klein(at)met(dot)no>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Huge Performance Difference on Similar Query in Pg7.2
Date: 2002-03-22 16:16:49
Message-ID: 15515.22769.371394.645257@polar.oslo.dnmi.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Tom Lane writes:
> Unfortunately the planner is not very bright about transitivity of
> equals and so does not realize that it could derive EmissionsView.setid
> = '4614' from the other two conditions. I imagine the reason Oracle
> gives similar behavior for both queries is that it does expend the
> cycles to make that deduction. (This is a guess though; it'd be
> interesting to see their query plans.)

I don't know how to retrieve the query plan from oracle7. Any hints and
I will send them

>
> Given the complexity of the plans, it may just be that the planner is
> switching into GEQO search mode and is failing to find the best plan.
> You might consider setting geqo_threshold larger than the number of
> base tables in the query (looks like 12, here) to see if better plans
> emerge.

Wow, thats impressive. By changing the geqo_threshold from 11 to 12, I
reduced time from 70000ms to 700ms. Increasing step by step it stay
constant until 15, then I have 70000ms at 16 and 17, and drop to 200ms
at 18 and stay between 100 and 150 when I increase it further. So why
not using a very high geqo_threshold (maybe 100) as threshold? (See
new query-plan below with geqo_threshold = 20)

> It'd also be worth asking whether you couldn't simplify your views ;-).
> Do you really require a 12-way join to do whatever it is you're doing?
> I'm pretty amazed that the system is able to find a good plan for either
> query ...

Exactly that was the problems, all queries where quite slow. This one
was only one I used very often. But I cannot do everything at once. I
just convinced my boss to switch from Oracle to Postgres, and that this
wouldn't take a long time. So the frontend is important for the views
(it asks for them), and I may not change this before I haven't managed to
converert to Postgres smoothly. Short: its politics

But thanks a lot for your help, that was really usefull and a huge step
forward for me.

Heiko

emep=> explain analyze select count(*) from LowHiSectorGridEmissionsView , DataSetsView where DataSetsView.setid = '4614' and DataSetsView.setid = LowHiSectorGridEmissionsView.setid ;
NOTICE: QUERY PLAN:

Aggregate (cost=1992.56..1992.56 rows=1 width=131) (actual time=112.80..112.80 rows=1 loops=1)
-> Nested Loop (cost=1989.38..1992.55 rows=4 width=131) (actual time=105.12..111.02 rows=1606 loops=1)
-> Seq Scan on footnotes (cost=0.00..1.01 rows=1 width=4) (actual time=0.03..0.04 rows=1 loops=1)
-> Materialize (cost=1991.49..1991.49 rows=4 width=127) (actual time=105.06..106.01 rows=1606 loops=1)
-> Hash Join (cost=1989.38..1991.49 rows=4 width=127) (actual time=75.34..102.23 rows=1606 loops=1)
-> Seq Scan on areas (cost=0.00..1.71 rows=71 width=4) (actual time=0.02..0.19 rows=71 loops=1)
-> Hash (cost=1989.37..1989.37 rows=4 width=123) (actual time=75.15..75.15 rows=0 loops=1)
-> Hash Join (cost=1984.27..1989.37 rows=4 width=123) (actual time=42.51..71.05 rows=1606 loops=1)
-> Seq Scan on sectordefinitions (cost=0.00..3.09 rows=109 width=22) (actual time=0.01..0.35 rows=109 loops=1)
-> Hash (cost=1984.20..1984.20 rows=26 width=101) (actual time=30.29..30.29 rows=0 loops=1)
-> Nested Loop (cost=22.01..1984.20 rows=26 width=101) (actual time=1.02..25.55 rows=1606 loops=1)
-> Hash Join (cost=22.01..24.08 rows=1 width=89) (actual time=0.97..1.19 rows=1 loops=1)
-> Seq Scan on areas (cost=0.00..1.71 rows=71 width=4) (actual time=0.01..0.16 rows=71 loops=1)
-> Hash (cost=22.00..22.00 rows=1 width=85) (actual time=0.87..0.87 rows=0 loops=1)
-> Nested Loop (cost=16.07..22.00 rows=1 width=85) (actual time=0.81..0.87 rows=1 loops=1)
-> Hash Join (cost=16.07..17.35 rows=1 width=77) (actual time=0.78..0.84 rows=1 loops=1)
-> Seq Scan on datatypes (cost=0.00..1.18 rows=18 width=27) (actual time=0.01..0.06 rows=18 loops=1)
-> Hash (cost=16.07..16.07 rows=1 width=50) (actual time=0.69..0.69 rows=0 loops=1)
-> Nested Loop (cost=8.02..16.07 rows=1 width=50) (actual time=0.31..0.68 rows=1 loops=1)
-> Nested Loop (cost=8.02..12.69 rows=1 width=28) (actual time=0.28..0.65 rows=1 loops=1)
-> Hash Join (cost=8.02..10.09 rows=1 width=24) (actual time=0.21..0.39 rows=1 loops=1)
-> Seq Scan on areas (cost=0.00..1.71 rows=71 width=4) (actual time=0.01..0.14 rows=71 loops=1)
-> Hash (cost=8.01..8.01 rows=1 width=20) (actual time=0.11..0.11 rows=0 loops=1)
-> Nested Loop (cost=0.00..8.01 rows=1 width=20) (actual time=0.10..0.11 rows=1 loops=1)
-> Index Scan using datasets_setid_idx on datasets (cost=0.00..3.36 rows=1 width=12) (actual time=0.06..0.06 rows=1 loops=1)
-> Index Scan using reports_pkey on reports (cost=0.00..4.64 rows=1 width=8) (actual time=0.03..0.03 rows=1 loops=1)
-> Seq Scan on areas (cost=0.00..1.71 rows=71 width=4) (actual time=0.00..0.16 rows=71 loops=1)
-> Index Scan using datasets_setid_idx on datasets (cost=0.00..3.36 rows=1 width=22) (actual time=0.02..0.02 rows=1 loops=1)
-> Index Scan using reports_pkey on reports (cost=0.00..4.64 rows=1 width=8) (actual time=0.01..0.01 rows=1 loops=1)
-> Index Scan using lhsecgriemis_setid_idx on lowhisectorgridemissions (cost=0.00..1953.36 rows=541 width=12) (actual time=0.04..7.30 rows=1606 loops=1)
Total runtime: 114.50 msec

EXPLAIN

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Stephan Szabo 2002-03-22 16:35:24 Re: Huge Performance Difference on Similar Query in Pg7.2
Previous Message Jochem van Dieten 2002-03-22 16:02:07 Re: Newbie question - Which Linux?