Re: hash semi join caused by "IN (select ...)"

From:	Scott Carey <scott(at)richrelevance(dot)com>
To:	Clemens Eisserer <linuxhippy(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: hash semi join caused by "IN (select ...)"
Date:	2011-05-18 17:41:40
Message-ID:	C9F95085.373D6%scott@richrelevance.com
Lists:	pgsql-performance

On 5/17/11 12:38 AM, "Clemens Eisserer" <linuxhippy(at)gmail(dot)com> wrote:

>Hi,
>
>>> select .... from t1 left join t2 .... WHERE id IN (select ....)
>>
>> Does it work as expected with one less join? If so, try increasing
>> join_collapse_limit ...
>
>That did the trick - thanks a lot. I only had to increase
>join_collapse_limit a bit and now get an almost perfect plan.
>Instead of hash-joining all the data, the planner generates
>nested-loop joins that use indexes on only the few rows I fetch.
>
>Using = ANY(ARRAY(select ...)) also seems to work; I wonder which one
>works better. Does ANY(ARRAY(...)) force the optimizer to plan the
>subquery separately from the main query?

I'm not sure exactly what happens with ANY(ARRAY()).

I am fairly confident that the planner simply transforms an IN (select ...)
into a join (a semi join, as in the plan from the subject line), since the
two are equivalent.

Because "foo IN (select ...)" is planned as just another join, it counts
towards join_collapse_limit.
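
A minimal sketch of how the two forms could be compared, assuming
hypothetical tables t1, t2 and t3 (the names, columns and the value 12 are
illustrative and not taken from the original query). It raises
join_collapse_limit for the current session only and asks for the plan of
each variant. The actual plans depend on table sizes and statistics, so no
expected output is shown.

```sql
-- Hypothetical schema, for illustration only.
CREATE TEMP TABLE t1 (id integer PRIMARY KEY, payload text);
CREATE TEMP TABLE t2 (id integer PRIMARY KEY, t1_id integer, note text);
CREATE TEMP TABLE t3 (t1_id integer, flag boolean);

-- Show the current value; the built-in default is 8.
SHOW join_collapse_limit;

-- Raise the limit for this session only (12 is an arbitrary example value).
SET join_collapse_limit = 12;

-- Variant 1: IN (select ...). If the planner turns this into a semi join,
-- it is one more item in the join list that the limit applies to.
EXPLAIN
SELECT t1.*
FROM t1
LEFT JOIN t2 ON t2.t1_id = t1.id
WHERE t1.id IN (SELECT t3.t1_id FROM t3 WHERE t3.flag);

-- Variant 2: = ANY (ARRAY(select ...)). Compare this plan with the one above
-- to see whether the subquery is handled apart from the main join tree.
EXPLAIN
SELECT t1.*
FROM t1
LEFT JOIN t2 ON t2.t1_id = t1.id
WHERE t1.id = ANY (ARRAY(SELECT t3.t1_id FROM t3 WHERE t3.flag));

-- Restore the configured value.
RESET join_collapse_limit;
```

Using SET rather than editing postgresql.conf keeps the change local to the
session, which makes it easy to try a few values before changing anything
globally.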

>
>Thanks, Clemens

In response to

Re: hash semi join caused by "IN (select ...)" at 2011-05-17 07:38:55 from Clemens Eisserer
