From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-general(at)postgresql(dot)org, Ben Harper <rogojin(at)gmail(dot)com>
Subject: Re: SELECT DISTINCT very slow
Date: 2009-07-10 00:50:40
Message-ID: 10703.1247187040@sss.pgh.pa.us
Lists: pgsql-general
Greg Stark <gsstark(at)mit(dot)edu> writes:
> Not really. The OP doesn't say how wide the record rows are but unless
> they're very wide it wouldn't pay to use an index for this even if you
> didn't have to access the heap also. It's going to be faster to scan
> the whole heap and either sort or use a hash. Currently there aren't
> many cases where a btree with 6,000 copies of 111 distinct keys is
> going to be useful.
It was 600,000 not 6,000 ... so a skip-scan might be worth the trouble,
but as you say we haven't done it.
In any case I think the real issue is that the OP is probably using a
pre-8.4 release, which will always do SELECT DISTINCT via sort-and-unique.
Hash aggregation would be a whole lot faster for these numbers, even
if not exactly instantaneous. He could update to 8.4, or go over to
using GROUP BY as was recommended upthread.
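Something along these lines, that is (table and column names made up here,
just to show the shape of the rewrite):

    -- pre-8.4: always implemented via sort-and-unique
    SELECT DISTINCT category_id FROM big_table;

    -- same result set, but the planner can consider hash aggregation
    SELECT category_id FROM big_table GROUP BY category_id;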
regards, tom lane