| From: | Chris Mungall <cjm(at)fruitfly(dot)org> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | <pgsql-admin(at)postgresql(dot)org> | 
| Subject: | Re: abnormally long time in performing a two-table join | 
| Date: | 2002-08-11 23:09:30 | 
| Message-ID: | Pine.LNX.4.33.0208111604120.16003-100000@sos.lbl.gov | 
| Lists: | pgsql-admin | 
On Sun, 11 Aug 2002, Tom Lane wrote:
> Chris Mungall <cjm(at)fruitfly(dot)org> writes:
> > Nested Loop  (cost=0.00..197011.39 rows=223 width=8) (actual time=16744.92..44572.00 rows=15 loops=1)
> >   ->  Index Scan using seqfeature_pkey on seqfeature  (cost=0.00..61715.62 rows=44674 width=4) (actual time=0.29..14669.06 rows=100030 loops=1)
> >   ->  Index Scan using sfqv_idx1 on sfqv  (cost=0.00..3.02 rows=1 width=4) (actual time=0.29..0.29 rows=0 loops=100030)
>
> Odd ... I'm surprised it doesn't choose a hash join.  What do you get if
> you try it with "set enable_nestloop = off" ?
Much better!
omicia29=# set enable_nestloop = off;
SET VARIABLE
omicia29=# explain analyze select seqfeature_id from seqfeature NATURAL JOIN sfqv where qualifier_value = 'BRCA1' and seqfeature.seqfeature_key_id = 15;
NOTICE:  QUERY PLAN:
Hash Join  (cost=55921.31..223860.67 rows=778 width=8) (actual time=4249.63..4259.47 rows=15 loops=1)
  ->  Index Scan using sfqv_idx3 on sfqv  (cost=0.00..167411.84 rows=41423 width=4) (actual time=38.05..44.50 rows=110 loops=1)
  ->  Hash  (cost=51056.43..51056.43 rows=49453 width=4) (actual time=4211.15..4211.15 rows=0 loops=1)
        ->  Seq Scan on seqfeature  (cost=0.00..51056.43 rows=49453 width=4) (actual time=0.14..3974.12 rows=100030 loops=1)
Total runtime: 4259.67 msec
EXPLAIN
omicia29=# set enable_nestloop = on;
SET VARIABLE
omicia29=# explain analyze select seqfeature_id from seqfeature NATURAL JOIN sfqv where qualifier_value = 'BRCA1' and seqfeature.seqfeature_key_id = 15;
NOTICE:  QUERY PLAN:
Nested Loop  (cost=0.00..203740.72 rows=778 width=8) (actual time=8382.36..15421.84 rows=15 loops=1)
  ->  Seq Scan on seqfeature  (cost=0.00..51056.43 rows=49453 width=4) (actual time=0.14..4198.16 rows=100030 loops=1)
  ->  Index Scan using sfqv_idx1 on sfqv  (cost=0.00..3.07 rows=1 width=4) (actual time=0.11..0.11 rows=0 loops=100030)
Total runtime: 15422.03 msec
However, I'm not sure what the implications of turning nestloop off
altogether are. Maybe I can set it just for this query.
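
One way to limit the setting to this single query might look like the sketch below. It assumes a server release that supports SET LOCAL inside a transaction block; on a release without it, the session-level SET / RESET pair in the trailing comment should have the same effect for the rest of the session. The query is the one from the EXPLAIN runs above; nothing else here comes from the thread.

```sql
-- Sketch only: assumes the server supports SET LOCAL.
BEGIN;
SET LOCAL enable_nestloop = off;   -- reverts automatically at COMMIT or ROLLBACK
SELECT seqfeature_id
  FROM seqfeature NATURAL JOIN sfqv
 WHERE qualifier_value = 'BRCA1'
   AND seqfeature.seqfeature_key_id = 15;
COMMIT;

-- Session-level alternative:
--   SET enable_nestloop = off;
--   ... run the query ...
--   RESET enable_nestloop;
```

Either way, other queries on the same connection keep the planner's default behaviour, which avoids the open question of what disabling nested loops everywhere would do.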
> > CREATE INDEX sfqv_idx4 ON sfqv USING btree (seqfeature_id, qualifier_value);
>
> > I would have thought sfqv_idx4 would be useful in this particular query?
>
> You'd have to write
>
> select ... where qualifier_value = 'BRCA1' and
> seqfeature.seqfeature_key_id = 15 and
> sfqv.seqfeature_key_id = 15
>
> to get it to consider that index.  You know and I know that the join
> should imply sfqv.seqfeature_key_id = 15, but the planner doesn't.
Sorry, my column names are confusing.
The primary key of the seqfeature table is "seqfeature_id";
seqfeature_key_id is actually a FK into another table entirely.
(Ideally this would be a 3-table join, but the 3rd table is small, so I
prefetch all the pkeys in my code, which speeds things up.)
> 			regards, tom lane
>