From: | "Jim C(dot) Nasby" <jim(at)nasby(dot)net> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Optimizer not using index on 120M row table |
Date: | 2003-04-08 03:29:03 |
Message-ID: | 20030407222903.M31861@flake.decibel.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
As background: this is the main table for http://stats.distributed.net
Table "public.email_contrib"
Column | Type | Modifiers
------------+---------------+-----------
id | integer | not null
team_id | integer | not null
date | date | not null
project_id | smallint | not null
work_units | numeric(20,0) | not null
Indexes: email_contrib_pkey primary key btree (project_id, id, date)
id is the id of a participant, team_id is the team they were on for that
day, date is the day the work was done, project_id is the project, and
work_units is the amount of work done.
explain select * from email_contrib where project_id=8 and id=39622 and
date='3/1/03';
QUERY PLAN
-------------------------------------------------------------------------------
Seq Scan on email_contrib (cost=0.00..2942185.40 rows=1 width=25)
Filter: ((project_id = 8) AND (id = 39622) AND (date =
'2003-03-01'::date))
(2 rows)
Is there any reason why this shouldn't be using the index? The
selectivity on project_id is very low (only 5 values for all 120M rows).
select attname, avg_width, n_distinct, correlation from pg_stats where
tablename='email_contrib';
attname | avg_width | n_distinct | correlation
------------+-----------+------------+-------------
id | 4 | 95184 | 0.496598
team_id | 4 | 1361 | 0.219478
date | 4 | 1769 | 0.329469
project_id | 2 | 5 | 1
work_units | 11 | 2100 | 0.0900541
It seems that not only should the query I explained be using the index,
but it seems that something like select sum(work_units) .. where
project_id=8 and id=39622 should also use the index.
I've run vacuum analyze to no effect.
On a related note, will pgsql do 'index covering', reading only the
index if it contains all the information a query needs? IE: in Sybase,
this query will only hit the index on Email_Contrib:
select date from email_contrib where project_id=8 and id=39622;
because email_contrib_pkey contains all required values.
--
Jim C. Nasby (aka Decibel!) jim(at)nasby(dot)net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
From | Date | Subject | |
---|---|---|---|
Next Message | Neil Conway | 2003-04-08 03:49:27 | Re: Optimizer not using index on 120M row table |
Previous Message | Lamar Owen | 2003-04-08 02:51:48 | Re: Failed dependencies: perl(Pg) is needed by postgresql-contrib |