From: | Rob Messer <rrmesser(at)yahoo(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Optimizer not using index when it should |
Date: | 2003-04-29 09:03:01 |
Message-ID: | 20030429090301.13152.qmail@web41213.mail.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Like some other recent messages, this one is about getting postgresql
to use an index that it seems it should clearly use. (But this one has
nothing to do with count(*)).
Here is my table:
Column | Type | Modifiers
--------------+-----------------------------+-----------
dsid | character varying(20) | not null
recid | numeric(8,0) | not null
trans_id | character varying(16) | not null
status | character varying(1) |
init_ts | timestamp without time zone |
last_ts | timestamp without time zone |
last_form_id | character varying(8) |
secval | character varying(20) |
Indexes: ds_rec1 unique btree (recid),
ds_rec2 btree (dsid)
Here is my version info:
PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2
20020903 (Red Hat Linux 8.0 3.2-7)
Currently the table ds_record has about 250,000 records in it. Of
those, about 3000 have dsid = 'starz'. When I need to look up all the
recids with this dsid in ds_record, I have the following simple query:
select recid from ds_record where dsid = 'startz';
But it doesn't use the index ds_rec2 on dsid. Here is the explain
analyze output:
intellis2=> explain analyze select recid from ds_record where dsid =
'starz';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Seq Scan on ds_record (cost=0.00..6186.21 rows=3484 width=12) (actual
time=10.60..408.12 rows=3484 loops=1)
Filter: (dsid = 'starz'::character varying)
Total runtime: 410.14 msec
(3 rows)
but if I turn off seqscan I get this:
intellis2=> set enable_seqscan=off;
SET
intellis2=> explain analyze select recid from ds_record where dsid =
'starz';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using ds_rec2 on ds_record (cost=0.00..7185.47 rows=3484
width=12)
(actual time=0.17..12.94 rows=3484 loops=1)
Index Cond: (dsid = 'starz'::character varying)
Total runtime: 14.97 msec
(3 rows)
so it is faster by more than a factor of 25 to use the index. The
problem gets worse when I add a join to the table.
I have tried the following:
alter table ds_record alter dsid set statistics 1000;
vacuum analyze ds_record;
drop index ds_rec2;
CREATE INDEX ds_rec2 ON ds_record USING btree (dsid);
But to no avail, I get the same results.
Interestingly, for queries that return fewer rows it does use the
correct index. For example, dsid="mapbuy2" appears about 500 times in
ds_record. Here is the explain out there (with enable_seqscan back
on):
intellis2=> explain analyze select recid from ds_record where dsid =
'mapbuy2';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Index Scan using ds_rec2 on ds_record (cost=0.00..1351.17 rows=522
width=12) (actual time=0.18..4.31 rows=522 loops=1)
Index Cond: (dsid = 'mapbuy2'::character varying)
Total runtime: 4.68 msec
To me it seems that the threshold for doing a table scan is wrong --
when the rows retrieved are about 1.25% of the table it does a scan.
What can I do to fix this -- is there something I am missing about
setting statistics or some configuration variable I can change? Any
insights would be greatly appreciated. Thank you,
Rob Messer
__________________________________
Do you Yahoo!?
The New Yahoo! Search - Faster. Easier. Bingo.
http://search.yahoo.com
From | Date | Subject | |
---|---|---|---|
Next Message | Rajesh Kumar Mallah | 2003-04-29 11:12:24 | Re: Is 292 inserts/sec acceptable performance ? |
Previous Message | Shridhar Daithankar | 2003-04-29 07:25:15 | Re: Is 292 inserts/sec acceptable performance ? |