Re: Have I b0rked something? Slow comparisons on "where x in (...)"

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Stephen Harris" <lists(at)spuddy(dot)org>, "Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Have I b0rked something? Slow comparisons on "where x in (...)"
Date: 2007-05-02 19:45:08
Message-ID: D425483C2C5C9F49B5B7A41F8944154701000625@postal.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Have you done a vacuum on the table recently?

I would be curious to see how:

select stuff from table
where index_key = <key1> AND non_index_row in ('xyz','abc','def')
UNION ALL
select stuff from table
where index_key = <key2> AND non_index_row in ('xyz','abc','def')
...
UNION ALL
select stuff from table
where index_key = <key600> AND non_index_row in ('xyz','abc','def')

performs by comparison. If, after a vacuum, it performs better than the
IN list, then the IN list might benefit from a bit of analysis for
better tuning chances.

> -----Original Message-----
> From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-
> owner(at)postgresql(dot)org] On Behalf Of Stephen Harris
> Sent: Wednesday, May 02, 2007 11:32 AM
> To: Postgres General
> Subject: [GENERAL] Have I b0rked something? Slow comparisons on "where
x
> in (...)"
>
> Postgres version 8.0.9 on Solaris 2.8. I know it's old but...
>
> I have a table with a million rows.
>
> I need to select data from this table based on an indexed column; I
need
> to select 600 possible values from the column, returning around 24,000
> rows of data.
>
> In perl I have a hash which has 600 key values in it.
>
> So I did:
>
> "select stuff from table where index_key in (" .
> join(",",keys %hash) . ") AND non_index_row in
('xyz','abc','def')
>
> And in the perl
> while(fetch())
> {
> do_stuff
> }
>
> This resulted in a query string which executed in 12 minutes. If I
> did an "explain" on the query string then I can see it was being
expanded
> to 600 OR statements
> where (index_key = 1) OR (index_key = 2) OR .....
>
>
> Now as an alternative option I did
> select stuff from table where non_index_row in ('xyz','abc','def')
> and in the perl
> while(fetch())
> {
> next unless $hash{$_->{index_key}};
> do_stuff
> }
>
> To me this should be slower since we're selecting more rows, throwing
> the data back to the perl and then discarding values I didn't want.
>
> Imagine my surprise when the result took 3 minutes to execute.
>
> Have I broken something, somewhere? Or are IN comparisons really that
> slow?
>
> For what it's worth, a simple
>
> explain select count(*) from table where index_key in (1,2,3,4,....)
>
> uses the index up until 156 values but then switches to sequential
scan
> when there are 157 or more values in query.
>
> Any thoughts? I fear my poor tuning attempts may have caused other
> slow downs!
>
> --
>
> rgds
> Stephen
>
> ---------------------------(end of
broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that
your
> message can get through to the mailing list cleanly

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Stephen Harris 2007-05-02 19:51:17 Re: Have I b0rked something? Slow comparisons on "where x in (...)"
Previous Message Stephen Harris 2007-05-02 18:31:37 Have I b0rked something? Slow comparisons on "where x in (...)"