From: | Joel Burton <joel(at)joelburton(dot)com> |
---|---|
To: | "David F(dot) Skoll" <dfs(at)roaringpenguin(dot)com> |
Cc: | pgsql-admin(at)postgresql(dot)org |
Subject: | Re: What kind of index to use for many rows with few unique values? |
Date: | 2002-12-02 23:04:38 |
Message-ID: | 20021202230438.GA16554@temp.joelburton.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
On Mon, Dec 02, 2002 at 05:10:00PM -0500, David F. Skoll wrote:
> Hi,
>
> I have a table with a column called "state". Each row can be in one
> of four states, let's call them 'new', 'pending', 'ok', and 'bad'.
> On average, about 95% of the rows will be 'bad', with the remaining
> 5% being in one of the other three states. If the table has 50K rows
> and I just want to pull out the 'ok' rows, I don't want to do a sequential
> scan. To pull out the 'bad' rows, obviously, sequential scan is fine.
>
> I've heard that a btree index performs badly in this situation. Is
> a hash index appropriate? I've heard bad things about hash indexes in
> PostgreSQL.
create table states (id serial primary key, state varchar(10), t text );
create function fill_states(varchar, int) returns bool as '
begin for i in 1 .. $2
loop
insert into states (state, t) values ($1, ''random'');
end loop;
return true;
end;
' language plpgsql;
select fill_states('ok',45000);
select fill_states('bad',5000);
select fill_states('warning',5000);
analyze states;
joel(at)joel=# explain select * from states where state='warning';
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using state_idx on states (cost=0.00..1013.00 rows=5533 width=20)
Index Cond: (state = 'warning'::character varying)
(2 rows)
joel(at)joel=# explain select * from states where state='ok';
QUERY PLAN
--------------------------------------------------------------
Seq Scan on states (cost=0.00..1171.51 rows=44627 width=20)
Filter: (state = 'ok'::character varying)
(2 rows)
Looks right to me: index scan for the less-common option, seqscan for
the most common. Why don't you think this, as a btree, will work for
you?
--
Joel BURTON | joel(at)joelburton(dot)com | joelburton.com | aim: wjoelburton
Independent Knowledge Management Consultant
From | Date | Subject | |
---|---|---|---|
Next Message | David F. Skoll | 2002-12-02 23:14:49 | Re: What kind of index to use for many rows with few unique |
Previous Message | David F. Skoll | 2002-12-02 22:10:00 | What kind of index to use for many rows with few unique values? |