From: | Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> |
---|---|
To: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> |
Cc: | George Pavlov <gpavlov(at)mynewplace(dot)com>, Kynn Jones <kynnjo(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: How to "unique-ify" HUGE table? |
Date: | 2008-12-23 19:47:36 |
Message-ID: | 49514058.9090005@kaltenbrunner.cc |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Scott Marlowe wrote:
> On Tue, Dec 23, 2008 at 11:14 AM, George Pavlov <gpavlov(at)mynewplace(dot)com> wrote:
>> You don't say what PG version you are on, but just for kicks you may try
>> using GROUP BY instead of DISTINCT. Yes, the two should perform the
>> same, but with 8.1 (or maybe 8.0) I had seen situations where GROUP BY
>> was faster (admittedly this happened with more complex queries). So, try
>> this:
>
> Even in 8.3 it looks like group by is faster. Tested it on a decent
> sized table and group by used a hash agg and ran in ~600 ms, while
> distinct used a sort and ran in 1300 ms. That was on 500k rows. On a
> much larger table, one with about 10M rows, a similar statement runs
> in 1500 ms with group by and in 2390 ms when run with distinct.
Not surprising - this is a known limitation in all released versions of
postgresql (GROUP BY can use hashing and sorting - DISTINCT only
sorting). 8.4 is going to improve that though.
Stefan
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2008-12-24 14:14:06 | Re: dbt-2 tuning results with postgresql-8.3.5 |
Previous Message | Scott Marlowe | 2008-12-23 18:37:07 | Re: How to "unique-ify" HUGE table? |