Re: [SQL] Sub-select speed.

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Mitch Vincent <mitch(at)venux(dot)net>, pgsql-sql(at)postgreSQL(dot)org
Subject: Re: [SQL] Sub-select speed.
Date: 1999-12-17 22:05:33
Message-ID: 199912172205.RAA04860@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

> "Mitch Vincent" <mitch(at)venux(dot)net> writes:
> > select * from applicants as a where a.status = 'A' and a.app_id in(select
> > b.app_id from resume_search as b where a.app_id=b.app_id and b.user_id=291)
>
> WHERE ... IN (sub-select) is a very inefficient substitute for a plain join
> query. Try something like
>
> select a.* from applicants as a, resume_search as b
> where a.status = 'A' and a.app_id = b.app_id and b.user_id = 291;
>
> Providing indexes on one or both app_id fields might help. Also, don't
> forget to do a VACUUM every so often to make sure the planner has useful
> statistics about the sizes of the tables.
>
> There has been some talk of automatically rewriting queries to eliminate
> unnecessary sub-selects, but I don't foresee it getting done for a
> while yet.

Tom, you mentioned that subselects use nested join, but they could be
hardcoded to use hash join. My opinion is that this should be done if
it is easy. I know the concern was that a hash could overflow if the
subquery is too larger, but if the subquery is too large, nested join
will take forever, so it really doesn't matter if it completes or not.

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message kaiq 1999-12-17 22:44:34 avg() on numeric ?
Previous Message admin 1999-12-17 22:04:54 making 'like' queries quicker