Re: slow IN() clause for many cases

From: Andrew - Supernews <andrew+nonews(at)supernews(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: slow IN() clause for many cases
Date: 2005-10-12 21:17:51
Message-ID: slrndkqvbv.2db7.andrew+nonews@trinity.supernews.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2005-10-12, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andrew - Supernews <andrew+nonews(at)supernews(dot)com> writes:
>> As the number of items in the IN clause increases, the planning time grows
>> rather radically.
>
> I was looking at this yesterday. There is some O(N^2) behavior in
> create_bitmap_subplan, stemming from trying to remove duplicated qual
> conditions. That strikes me as a relatively useless activity, and I was
> thinking the easiest answer might be to just delete that "optimization".

Well, the behaviour is definitely bad.

For comparison, on 8.0 the IN (list of 1000 items) version plans only about
four times slower than IN (select array...), and again the execution times
are comparable. But for the case of a real-world app that uses IN () a lot
(dspam), I've had reports that the performance improvements from switching
to the array method are even more substantial than my raw timings suggest.

>> The actual execution time of these two is very close, with the second
>> being about 10% slower on my system (31ms vs 34ms, based on \timing values
>> from psql and averaged over several goes). However, the timings returned
>> from EXPLAIN ANALYZE are much more skewed: 42ms vs 66ms as reported in the
>> "total runtime" line. So not only is the planning time different, but also
>> the instrumentation overhead of EXPLAIN ANALYZE is wildly different between
>> the two forms.
>
> Yeah, this isn't all that surprising because of the larger number of
> plan nodes involved in the bitmap plan.

I think you have this backwards - the instrumentation overhead is _lower_
for the bitmap-OR plan than for the nestloop. This was also true in 8.0;
the instrumentation overhead of an index OR scan plan is lower than the
nestloop plan, though not by nearly as much.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Eric Sproul 2005-10-12 21:40:00 Re: 8.1 beta1 -> beta2 upgrade question
Previous Message Martijn van Oosterhout 2005-10-12 21:00:38 Re: database vacuum from cron hanging