Speeding up select distinct

From: Laurent Martelli <laurent(at)aopsys(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Speeding up select distinct
Date: 2005-03-16 17:58:35
Message-ID: 87k6o74gms.fsf@stan.aopsys
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Consider this query:

SELECT distinct owner from pictures;

Unique (cost=361.18..382.53 rows=21 width=4) (actual time=14.197..17.639 rows=21 loops=1)
-> Sort (cost=361.18..371.86 rows=4270 width=4) (actual time=14.188..15.450 rows=4270 loops=1)
Sort Key: "owner"
-> Seq Scan on pictures (cost=0.00..103.70 rows=4270 width=4) (actual time=0.012..5.795 rows=4270 loops=1)
Total runtime: 19.147 ms

I thought that 19ms to return 20 rows out of a 4000 rows table so I
added an index:

CREATE INDEX pictures_owner ON pictures (owner);

It gives a slight improvement:

Unique (cost=0.00..243.95 rows=21 width=4) (actual time=0.024..10.293 rows=21 loops=1)
-> Index Scan using pictures_owner on pictures (cost=0.00..233.27 rows=4270 width=4) (actual time=0.022..8.227 rows=4270 loops=1)
Total runtime: 10.369 ms

But still, it's a lot for 20 rows. I looked at other type of indexes,
but they seem to either not give beter perfs or be irrelevant.

Any ideas, apart from more or less manually maintaining a list of
distinct owners in another table ?

--
Laurent Martelli
laurent(at)aopsys(dot)com Java Aspect Components
http://www.aopsys.com/ http://jac.objectweb.org

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2005-03-16 18:03:10 Re: cpu_tuple_cost
Previous Message David Gagnon 2005-03-16 17:02:04 Re: Performance problem on delete from for 10k rows. May