From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Ryan Holmes <ryan(at)hyperstep(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: IN operator causes sequential scan (vs. multiple OR expressions) |
Date: | 2007-01-28 01:56:39 |
Message-ID: | 26081.1169949399@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Ryan Holmes <ryan(at)hyperstep(dot)com> writes:
> So, yes, disabling seqscan does force an index scan for the IN
> version. My question now is, how do I get PostgreSQL to make the
> "right" decision without disabling seqscan?
I pinged you before because in a trivial test case I got
indexscans out of both text and varchar cases:
regression=# create table foo (f1 text unique, f2 varchar(25) unique);
NOTICE: CREATE TABLE / UNIQUE will create implicit index "foo_f1_key" for table "foo"
NOTICE: CREATE TABLE / UNIQUE will create implicit index "foo_f2_key" for table "foo"
CREATE TABLE
regression=# explain select * from foo where f1 in ('foo', 'bar');
QUERY PLAN
-------------------------------------------------------------------------
Bitmap Heap Scan on foo (cost=4.52..9.86 rows=2 width=61)
Recheck Cond: (f1 = ANY ('{foo,bar}'::text[]))
-> Bitmap Index Scan on foo_f1_key (cost=0.00..4.52 rows=2 width=0)
Index Cond: (f1 = ANY ('{foo,bar}'::text[]))
(4 rows)
regression=# explain select * from foo where f2 in ('foo', 'bar');
QUERY PLAN
-------------------------------------------------------------------------------------
Bitmap Heap Scan on foo (cost=6.59..17.27 rows=10 width=61)
Recheck Cond: ((f2)::text = ANY (('{foo,bar}'::character varying[])::text[]))
-> Bitmap Index Scan on foo_f2_key (cost=0.00..6.59 rows=10 width=0)
Index Cond: ((f2)::text = ANY (('{foo,bar}'::character varying[])::text[]))
(4 rows)
But on closer inspection the second case is not doing the right thing:
notice the rowcount estimate is 10, whereas it should be only 2 because
of the unique index on f2. I poked into it and realized that in 8.2
scalararraysel() fails to deal with binary-compatible datatype cases,
instead falling back to a not-very-bright generic estimate.
I've committed a fix for 8.2.2, but in the meantime maybe you could
change your varchar column to text?
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Ryan Holmes | 2007-01-28 02:31:45 | Re: IN operator causes sequential scan (vs. multiple OR expressions) |
Previous Message | Ryan Holmes | 2007-01-28 01:34:12 | Re: IN operator causes sequential scan (vs. multiple OR expressions) |