Re: Have I b0rked something? Slow comparisons on "where x in (...)"

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Harris <lists(at)spuddy(dot)org>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Have I b0rked something? Slow comparisons on "where x in (...)"
Date: 2007-05-02 21:59:49
Message-ID: 21786.1178143189@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Stephen Harris <lists(at)spuddy(dot)org> writes:
> Postgres version 8.0.9 on Solaris 2.8. I know it's old but...
> I have a table with a million rows.

> I need to select data from this table based on an indexed column; I need
> to select 600 possible values from the column, returning around 24,000
> rows of data.

> In perl I have a hash which has 600 key values in it.

> So I did:

> "select stuff from table where index_key in (" .
> join(",",keys %hash) . ") AND non_index_row in ('xyz','abc','def')

> And in the perl
> while(fetch())
> {
> do_stuff
> }

> This resulted in a query string which executed in 12 minutes. If I
> did an "explain" on the query string then I can see it was being expanded
> to 600 OR statements
> where (index_key = 1) OR (index_key = 2) OR .....

In what, a seq scan?

That plan will require executing 600 integer comparisons at each of a
million rows, with only some trivial fraction avoided because of early
success. So it works out that your machine is able to do something over
800K such comparisons per second, which seems a bit slow for any modern
machine ... but I note 8.0 didn't have any of the "virtual slot"
optimizations added in later releases, and is doing a fresh
heap_getattr() for each access to the variable. If it's having to
grovel over a lot of variable-width fields to get to that field each
time, I can see where the time might get eaten up. Where is the
index_key column in the tuples, exactly?

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jason Nerothin 2007-05-02 22:44:42 Re: large table problem
Previous Message Tom Lane 2007-05-02 21:38:16 Re: forcing use of a specific (expression) index?