From: | Jann Röder <roederja(at)ethz(dot)ch> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Inefficient query plan |
Date: | 2010-08-23 04:05:52 |
Message-ID: | i4ss30$bq1$1@dough.gmane.org |
Lists: | pgsql-sql |
I have two tables:
A: ItemID (PK), IssueID (Indexed)
B: ItemID (FK), IndexNumber : PK(ItemID, IndexNumber)
Both tables have several million rows, but B has many more than A.
Now if I run
SELECT A.ItemID FROM A, B WHERE A.ItemID = B.itemID AND A.issueID = <some id>
The query takes extremely long (several hours). I ran EXPLAIN and got:
"Hash Join  (cost=516.66..17710110.47 rows=8358225 width=16)"
"  Hash Cond: ((b.itemid)::bpchar = a.itemid)"
"  ->  Seq Scan on b  (cost=0.00..15110856.68 rows=670707968 width=16)"
"  ->  Hash  (cost=504.12..504.12 rows=1003 width=16)"
"        ->  Index Scan using idx_issueid on a  (cost=0.00..504.12 rows=1003 width=16)"
"              Index Cond: (issueid = 'A1983PW823'::bpchar)"
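One detail in this plan may be worth checking: the Hash Cond casts b.itemid to bpchar before comparing it with a.itemid, which suggests the two ItemID columns might be declared with different types. A plain index on B's ItemID would generally not be usable for a comparison on the cast expression. A minimal sketch for checking the declared types, assuming the tables are stored under the lower-case names a and b that appear in the plan:

```sql
-- Assumption: the tables are named a and b, and the column is itemid,
-- exactly as they appear (lower-cased) in the EXPLAIN output.
SELECT table_name,
       column_name,
       data_type,
       character_maximum_length
FROM information_schema.columns
WHERE table_name IN ('a', 'b')
  AND column_name = 'itemid'
ORDER BY table_name;
```

If the two rows report different data_type values (for example character versus character varying), that mismatch would be a plausible source of the cast.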
So the problem is the seq scan on B. However, only a handful of rows
in A match a specific IssueID. So my question is: why doesn't it just
take each ItemID in A that has the correct IssueID and check whether
there is a matching ItemID in B? This should be really fast, because
ItemID in B is indexed as part of the primary key. Why is Postgres not
doing this, and is there a way I can make it do that? I'm using
PostgreSQL 8.4.4, and yes, I did run ANALYZE on the entire DB.
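The strategy described here is a nested loop with an inner index scan on B. One way to see whether the planner can produce that plan at all, and what it estimates it would cost, is to switch off the competing strategies for a single transaction and re-run EXPLAIN. A sketch, using the IssueID literal from the plan above; the enable_* settings are standard planner options intended for diagnosis rather than permanent use:

```sql
BEGIN;

-- Discourage the strategies the planner currently picks, for this
-- transaction only (SET LOCAL is reverted at ROLLBACK/COMMIT).
SET LOCAL enable_hashjoin  = off;
SET LOCAL enable_mergejoin = off;
SET LOCAL enable_seqscan   = off;

-- Same query as above, written with an explicit JOIN.
EXPLAIN
SELECT A.ItemID
FROM A
JOIN B ON A.ItemID = B.ItemID
WHERE A.IssueID = 'A1983PW823';

ROLLBACK;
```

If the resulting plan still shows a Seq Scan on b, that would suggest the planner does not consider the primary-key index applicable to this join condition, rather than merely judging it more expensive.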
Thanks a lot,
Jann