| From: | "Scott Marlowe" <smarlowe(at)qwest(dot)net> | 
|---|---|
| To: | "Anton Maksimenkov" <engineer(at)hlebprom(dot)ru> | 
| Cc: | pgsql-general(at)postgresql(dot)org | 
| Subject: | Re: Why index used/not used | 
| Date: | 2004-07-21 08:14:49 | 
| Message-ID: | 1090397689.709.32.camel@localhost.localdomain | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
On Tue, 2004-07-20 at 23:00, Anton Maksimenkov wrote:
>   Hello.
> 
> Explain.
>   I have table "traf_raw" contains field "sip_id" (integer). This field 
> indexed with "CREATE INDEX traf_raw_sip ON traf_raw (sip_id)".
> 
> Question.
>   When I try to get different rows postgres use index with one "sip_id" 
> and not use index with another "sip_id". I don't understand why it is 
> happen, but with more complex queries Seq Scan is so slowly.
> 
> Example.
>   With "sip_id='19'" there many rows in table, with "sip_id='29'" there 
> is no rows.
> 
> cnupm=> ANALYZE traf_raw;
> ANALYZE
> cnupm=> EXPLAIN ANALYZE SELECT * FROM traf_raw WHERE sip_id='19' LIMIT 
> 10 OFFSET 100000;
>                                                       QUERY PLAN 
> 
> ---------------------------------------------------------------------------------------------------------------------
>   Limit  (cost=5230.95..5230.99 rows=1 width=56) (actual 
> time=2505.89..2505.89 rows=0 loops=1)
>     ->  Seq Scan on traf_raw  (cost=0.00..5230.99 rows=10808 width=56) 
> (actual time=0.04..2490.02 rows=10977 loops=1)
>           Filter: (sip_id = 19)
>   Total runtime: 2505.95 msec
> (4 rows)
Note that in this instance, your query is returning >10k rows.
While in this example:
> cnupm=> EXPLAIN ANALYZE SELECT * FROM traf_raw WHERE sip_id='29' LIMIT 
> 10 OFFSET 100000;
>                                                             QUERY PLAN 
> 
> ---------------------------------------------------------------------------------------------------------------------------------
>   Limit  (cost=391.39..392.70 rows=1 width=56) (actual time=43.08..43.08 
> rows=0 loops=1)
>     ->  Index Scan using traf_raw_sip on traf_raw  (cost=0.00..392.70 
> rows=99 width=56) (actual time=43.07..43.07 rows=0 loops=1)
>           Index Cond: (sip_id = 29)
>   Total runtime: 43.16 msec
> (4 rows)
you are only returning 0 rows.
PostgreSQL uses a cost based planner.  So, in the first instance, it
thought it was returning enough rows to justify grabbing all the rows
first.
There are several settings that tune the planner to tell it when to
switch from an index scan to a seq scan.  random_page_cost is the most
commonly adusted one.  Try dropping it from the default of 4 to
something around 1.4 to 2.0 or so.  
Also, you can force the planner to not use seq scans unless it has to by
setting enable_seqscan to off:
set enable_seqscan = off;
and run the query again.  Test several of your queries and find out
where the turning point is, and then adjust the random_page_cost to
cause it to switch at about the right time.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Anton Maksimenkov | 2004-07-21 08:20:44 | Re: Why index used/not used | 
| Previous Message | Markus Wollny | 2004-07-21 07:36:30 | Re: tsearch2, ispell, utf-8 and german special characters |