Re: Highly obscure and erratic

From: Varun Kacholia <varunk(at)cse(dot)iitb(dot)ac(dot)in>
To: Shaun Thomas <sthomas(at)townnews(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Highly obscure and erratic
Date: 2002-06-20 21:56:23
Message-ID: 20020621032900.A15070@voxel.cse.iitb.ac.in
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

hi,

> SELECT *
> FROM dbmedia d, wdmedia w
> WHERE w.word = 'word1' AND d.id=w.id
> INTERSECT
> SELECT *
> FROM dbmedia d, wdmedia w
> WHERE w.word = 'word2' AND d.id=w.id
> INTERSECT
> ...

well this does not help...
it results in a seq scan only.

suryadb=# explain (select * from dbmedia d,wdmedia w where w.word='whatever' and d.id=w.id) INTERSECT (select * from dbmedia d,wdmedia w where w.word='whatever' and d.id=w.id) ;
NOTICE: QUERY PLAN:

SetOp Intersect (cost=25519.49..25674.00 rows=515 width=92)
-> Sort (cost=25519.49..25519.49 rows=5151 width=92)
-> Append (cost=5033.63..25201.94 rows=5151 width=92)
-> Subquery Scan *SELECT* 1 (cost=5033.63..12600.97 rows=2575 width=92)
-> Hash Join (cost=5033.63..12600.97 rows=2575 width=92)
-> Seq Scan on dbmedia d (cost=0.00..3499.83 rows=100883 width=76)
^^^^^^^^^^
-> Hash (cost=5027.19..5027.19 rows=2575 width=16)
-> Index Scan using wdkmedia on wdmedia w (cost=0.00..5027.19 rows=2575 width=16)
-> Subquery Scan *SELECT* 2 (cost=5033.63..12600.97 rows=2575 width=92)
-> Hash Join (cost=5033.63..12600.97 rows=2575 width=92)
-> Seq Scan on dbmedia d (cost=0.00..3499.83 rows=100883 width=76)
^^^^^^^^^^^^^^^^^^^^^^
-> Hash (cost=5027.19..5027.19 rows=2575 width=16)
-> Index Scan using wdkmedia on wdmedia w (cost=0.00..5027.19 rows=2575 width=16)

instead the self-join is better...atleast in the case of postgresql :p
anyways..
thanks for the reply!

> I think you get the picture from this. If he didn't need INTERSECT to
> emulate AND for his word matches, he could just use OR, and do this:
>
> SELECT *
> FROM dbmedia d, wdmedia w
> WHERE w.word IN ('word1', 'word2', 'etc') AND d.id=w.id;
>
> My guess is the OR version is faster, but he wants to restrict the data,
> not add to it. Unfortunate. ^_^
>
> > and avoid IN, INTERSECT and UNION at all costs.
>
> Actually, if INTERSECT and UNION are done properly (which I'm guessing
> they are) at most, the query has to be run once for each distinct query
> between the INTERSECT/UNION clauses. On a fast/small query, an upper
> bound of doubling execution time isn't too bad. Mathematical matrix
> intersections and unions are pretty fast/easy, so that doesn't add much
> overhead at all.
>
> It's IN you have to avoid. The way postgres does it is just insane, and
> you'll suffer greatly upon using it.
>
> --
> +-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
> | Shaun M. Thomas INN Database Administrator |
> | Phone: (309) 743-0812 Fax : (309) 743-0830 |
> | Email: sthomas(at)townnews(dot)com AIM : trifthen |
> | Web : www.townnews.com |
> | |
> | "Most of our lives are about proving something, either to |
> | ourselves or to someone else." |
> | -- Anonymous |
> +-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
>
>

--
------
Varun
Why do the people who know the least know it the loudest?

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Stephan Szabo 2002-06-20 22:03:41 Re: Yet another "Why won't PostgreSQL use my index?"
Previous Message David Link 2002-06-20 21:46:16 help on speeding up a one table query