From: | Joshua Tolley <eggyknap(at)gmail(dot)com> |
---|---|
To: | Thomas Hamilton <thomashamilton76(at)yahoo(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #5231: SELECT DISTINCT poorly implemented vs SELECT ... GROUP BY |
Date: | 2009-12-03 16:24:52 |
Message-ID: | 20091203162452.GC19666@eddie |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Thu, Dec 03, 2009 at 03:56:05PM +0000, Thomas Hamilton wrote:
>
> The following bug has been logged online:
>
> Bug reference: 5231
> Logged by: Thomas Hamilton
> Email address: thomashamilton76(at)yahoo(dot)com
> PostgreSQL version: 8.3.8
> Operating system: Ubuntu 4.2.4
> Description: SELECT DISTINCT poorly implemented vs SELECT ... GROUP
> BY
> Details:
>
> SELECT DISTINCT does a Sort followed by Unique.
>
> SELECT ... GROUP BY, which is logically equivalent, performs a
> HashAggregate.
>
> When run against a large dataset with a small number of distinct results
> HashAggregate is an order of magnitude more efficient!
>
> Since the spec does not require DISTINCT to return sorted results, I don't
> believe Sort ... Unique will ever be more efficient than HashAggregate.
>
> Therefore, in order to maximize performance, DISTINCT should always be
> implemented as HashAggregate.
In 8.4 and above, SELECT DISTINCT can use HashAggregates rather than
Sort/Unique -- though I'm not sure it always does. Anyway, an upgrade should
handle this for you. However, this isn't really a bug, it's a feature, so
don't expect to see it in an 8.3.x version.
-- Josh / eggyknap
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2009-12-03 17:49:51 | Re: BUG #5231: SELECT DISTINCT poorly implemented vs SELECT ... GROUP BY |
Previous Message | Thomas Hamilton | 2009-12-03 15:56:05 | BUG #5231: SELECT DISTINCT poorly implemented vs SELECT ... GROUP BY |