Re: [SQL] Subselect performance

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: Daniel Lopez <ridruejo(at)atm9(dot)com(dot)dtu(dot)dk>
Cc: rison(at)biochemistry(dot)ucl(dot)ac(dot)uk, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-sql(at)hub(dot)org
Subject: Re: [SQL] Subselect performance
Date: 1999-09-28 21:14:06
Message-ID: 199909282114.RAA04792@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

>
> > It is not your fault. We frequently get reports of this type, and the
> > behavior of the subquery is very non-intuitive. You would think that a
> > subquery and a join would have the same performance, but because of the
> > limitation of subqueries as being nested loop joined, this is not the
> > case, and subqueries are slower. We tell people to rewrite their query
> > as EXISTS, but by the time we tell them that, they have already spent
> > much time trying to figure out why the query is so slow, and I am sure
> > many people don't even know about the EXISTS workaround.
>
> You are right: I spend some time scratching my head, then some time
> searching the mailing lists and I finally made the query with a EXISTS,
> which works great for me :) Thanks
> Can this be a candidate to include in the FAQ?
>
> On the same idea, is there any good document out there with all the SQL
> "recipes" or common practice for things like : "Give me all the rows which
> have this value in this column more than once, etc"
> I do it with:
> select my_index, count(my_index) from my_table group by my_index having
> count(my_index) > 1;
>
> But this is a common query and would be interested in knowing which is the
> commonly accepted way of doing this

New FAQ:

4.23) Why are my subqueries using IN so slow?

Currently, we join subqueries to outer queries by sequential scanning
the result of the subquery for each row of the outer query. A workaround
is to replace IN with EXISTS. For example, change:

SELECT *
FROM tab
WHERE col1 IN (SELECT col2 FROM TAB2)

to:

SELECT *
FROM tab
WHERE EXISTS (SELECT col2 FROM TAB2 WHERE col1 = col2)

We hope to fix this limitation in a future release.

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Clayton Cottingham 1999-09-29 03:18:03 new irc channel
Previous Message Saltsgaver, Scott 1999-09-28 20:40:43 Reporting errors when a rule fails.