From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robin Houston <robin(dot)houston(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Huge difference in query performance between 8.3 and 8.4 (possibly) |
Date: | 2009-07-09 17:09:11 |
Message-ID: | 2712.1247159351@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Robin Houston <robin(dot)houston(at)gmail(dot)com> writes:
> We have a query that runs very slowly on our 8.3 database. (I can't
> tell you exactly how slowly, because it has never successfully run to
> completion even when we left it running overnight.) On the 8.4
> database on my laptop, it runs in about 90 seconds. Of course there
> are several differences between the two instances, but I wonder
> whether query planning improvements in 8.4 could essentially account
> for it.
Well, it's hard to be sure with only EXPLAIN and not EXPLAIN ANALYZE
output to look at; but I think the significant difference in these plans
is that 8.4 has chosen a hash instead of nestloop join for a couple of
the intermediate join levels. Which is evidently because of a change
in the estimated size of the next join down:
-> Nested Loop (cost=44050.86..77114.32 rows=1 width=50)
Join Filter: ((second.feature_cvterm_id > first.feature_cvterm_id) AND (second.feature_id = first.feature_id) AND (second.pub_id = first.pub_id) AND ((second_withfrom_dbxref.accession)::text = (first_withfrom_dbxref.accession)::text))
-> Nested Loop (cost=30794.26..42915.70 rows=1 width=69)
-> Hash Join (cost=30794.26..42906.88 rows=1 width=65)
Hash Cond: (second_evidence.type_id = evidence_type.cvterm_id)
versus
-> Hash Join (cost=63949.73..77732.49 rows=1 width=59)
Hash Cond: ((second.feature_id = first.feature_id) AND (second.pub_id = first.pub_id) AND ((second_withfrom_dbxref.accession)::text = (first_withfrom_dbxref.accession)::text))
Join Filter: (second.feature_cvterm_id > first.feature_cvterm_id)
-> Hash Join (cost=30236.57..41303.13 rows=4607 width=66)
Hash Cond: (second_evidence.type_id = evidence_type.cvterm_id)
If the 8.4 rowcount estimate is accurate then it's not surprising that
the nestloop plan sucks --- it'd be re-executing the other arm of the
join 4600 or so times.
This could reflect improvements in the join size estimation code, or
maybe it's just a consequence of 8.4 using larger statistics targets
by default. It's hard to be sure with so little information to go on.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Hartman, Matthew | 2009-07-09 17:36:00 | Re: Huge difference in query performance between 8.3 and 8.4 (possibly) |
Previous Message | Alvaro Herrera | 2009-07-09 17:02:31 | Re: Huge difference in query performance between 8.3 and 8.4 (possibly) |