From: | Jack Orenstein <jack(dot)orenstein(at)hds(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Postgres optimizer choosing wrong index |
Date: | 2008-10-23 15:40:06 |
Message-ID: | 49009AD6.9020902@hds.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
I'm using postgres 7.4 and having a problem with the query optimizer. Our table,
T, looks like this:
dh int
fh int
nm int
... -- other columns
A typical row is 400-500 bytes.
T has two indexes, idx_df on (dh, fh) and idx_dn on (dh, nm).
My query is
select * from T
where dh = ? and fh = ?
If I run EXPLAIN on this query, (plugging in values 1 and 2 for the variables),
before VACUUM ANALYZE, I get the desired execution plan:
Index Scan using idx_df on T (cost=0.00..4.83 rows=1 width=454)
Index Cond: ((dh = 1) AND (fh = 2))
But after VACUUM ANALYZE:
Index Scan using idx_dn on T (cost=0.00..5.27 rows=1 width=561)
Index Cond: (dh = 1)
Filter: (fh = 2)
Notice that postgres is now using the other index. This behavior is somewhat
dependent on the values plugged in. I ran a query to count dh values:
select dir_hash, count(*) from external_file group by dir_hash;
dh | count
------------+--------
916645488 | 20000
1057692240 | 200000
And if I use 1057692240 in the EXPLAIN, I get the desired plan:
Index Scan using idx_df on external_file (cost=0.00..5.27 rows=1 width=561)
Index Cond: ((dir_hash = 1057692240) AND (fn_hash = 2))
I've tried playing with various cost settings (e.g. random_page_cost), but have
been unable to influence the optimizer's behavior. Rewriting the query as ...
where (dh, fh) = (?, ?) doesn't help.
So a few questions:
- Why would the optimizer ever choose idx_dn over idx_df given that idx_df has
to be more selective?
- Is there any way to force the use of idx_df?
Jack Orenstein
P.S. Yes, I know, 7.4. We're upgrading to 8.3 but we have this problem right now.
From | Date | Subject | |
---|---|---|---|
Next Message | Ivan Sergio Borgonovo | 2008-10-23 16:05:18 | overhead of plpgsql functions over simple select |
Previous Message | Aaron | 2008-10-23 13:54:45 | Re: Storing questionnaire data |