| From: | Leandro Fanzone <leandro(at)hasar(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | postgreSQL <pgsql-novice(at)postgresql(dot)org> | 
| Subject: | Re: Secondary indexes | 
| Date: | 2001-02-08 20:14:45 | 
| Message-ID: | 3A82FE35.7EC81619@hasar.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-novice | 
You were right: I changed those dummy values from "0" to NULL, and now it
chooses to use the index. The percentage of dummy values was near 40%. Thank
you very much.
Leandro Fanzone.
Tom Lane wrote:
> Leandro Fanzone <leandro(at)hasar(dot)com> writes:
> > Hello. I have a table with, say, three fields. The first is the ID
> > (integer, unique, primary index). The second is an optional index that
> > sometimes can be blank, or even duplicated, a varchar(13). The third
> > one is the data I want to retrieve, it has no importance in this
> > problem. I need to access sometimes by the ID and sometimes by the
> > secondary key, so I made an index using
>
> > CREATE INDEX my_index_name ON MY_TABLE(my_secondary_id);
>
> > When I select using the primary key, obviously uses the index created
> > by default.  When I select using the secondary key, it EXPLAINs me
> > that it would use sequencial scan instead of the index I created, thus
> > this search becomes extremely slow. Why the engine would ignore the
> > index?
>
> Probably because it thinks the indexscan would not be very selective.
> An indexscan that has to visit more than a few percent of the rows in
> a table is actually slower than a seqscan, typically, and so the planner
> won't choose an indexscan if it thinks a large number of rows will be
> scanned.
>
> If the secondary column has a lot of "dummy" values as you imply, it's
> important to be sure that the dummy values are NULLs, not any other
> randomly chosen value; otherwise the dummies will skew the VACUUM
> ANALYZE statistics so that the planner will think the column contains
> only a few oft-repeated values.  If it thinks that, then it's likely
> to avoid indexscans.
>
> If you need more help, please send along the exact output of EXPLAIN
> for your problem query, also the EXPLAIN result after doing "SET
> enable_seqscan TO OFF", and the results of
>
> select attname,attdisbursion,s.*
> from pg_statistic s, pg_attribute a, pg_class c
> where starelid = c.oid and attrelid = c.oid and staattnum = attnum
> and relname = 'YOURTABLENAMEHERE';
>
> so we can see what statistics the planner is looking at and what its cost
> estimates are.  (NOTE: these directions assume you are running 7.0.*)
>
>                         regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Fred Schroeder | 2001-02-08 20:19:16 | Getting Started | 
| Previous Message | Tom Lane | 2001-02-08 19:22:04 | Re: Secondary indexes |