Re: SQL WHERE: many sql or large IN()

From: tom <tom(at)tacocat(dot)net>
To: Postgresql <pgsql-general(at)postgresql(dot)org>
Subject: Re: SQL WHERE: many sql or large IN()
Date: 2007-04-06 13:23:02
Message-ID: E96D3B35-2D70-48BE-AF10-3E7F48ADA6C5@tacocat.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

If I read this right, intarray is for reading values from an array
data type.

I don't have this.

I have a varchar() field that is indexed (unique), call it 'foo'

I have a choice of running:

SELECT bar FROM tokens WHERE foo IN ('apple','orange','biscuit'....)
for up to ~300 words

OR

SELECT bar FROM tokens WHERE foo = 'apple' up to ~300 times as a
prepared/cached SQL statements.

On Apr 6, 2007, at 9:10 AM, Oleg Bartunov wrote:

> Tom,
>
> have you seen contrib/intarray ?
>
> Oleg
> On Fri, 6 Apr 2007, tom wrote:
>
>> I'm wondering where the differences are in running two different
>> types of SQL statements.
>>
>> Given ~300 tokens/words I can either run 1 sql statement with a
>> large list in a "WHERE foo IN (...300 tokens...)"
>> or I can run ~300 statements, one for each token.
>> In the first case, the SQL is not prepared, but just executed.
>> In the second case, the SQL is prepared and run as a cached
>> execution plan (I think).
>>
>> Now. It would seem that the second approach would be painfully
>> slow. But I'm not sure that I'm seeing this.
>> Currently I have <5 users. As always, this might change...
>>
>> Before I start going about coding and testing lots of stuff I
>> thought I would ask for some historical experiences someone might
>> have had when comparing these two approaches and if there are
>> inflection points between the performance in terms of the number
>> of tokens or simultaneous users.
>>
>> I should add that the tokens are either indexed or primary indexed
>> but in both cases, unique, and not guaranteed to exist in every case.
>>
>> Initially it seems that the WHERE IN (...) approach takes a turn
>> for the worse when the list gets very large.
>> It also seems to do comparatively worse when the number of tokens
>> is very small.
>> But I can't claim any scientifically sound basis for making this
>> distinction.
>>
>> Any experiences someone would like to share?
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
>> message can get through to the mailing list cleanly
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Oleg Bartunov 2007-04-06 13:35:54 Re: SQL WHERE: many sql or large IN()
Previous Message Oleg Bartunov 2007-04-06 13:10:48 Re: SQL WHERE: many sql or large IN()