Re: How slow is DISTINCT?

From: TimothyReaves(at)westfieldgrp(dot)com
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: How slow is DISTINCT?
Date: 2002-02-28 13:45:25
Message-ID: OF6974336F.BD066288-ON85256B6E.004B7A3B@westfield-cos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Wei Weng wrote:

> Josh Berkus wrote:
>
>>Wei Wang,
>>
>>
>>
>>>How exactly slow is DISTINCT being processed in SQL engines? (not
>>>limited to postgresql, though comments on postgresql would be most
>>>relevant)
>>>
>>>
>>I can only give you a relative result, based exlusively on my anecdotal
>> experience with 7.1:
>>
>>Fast: SELECT ...
>>Slower: SELECT ... GROUP BY x,y,z
>> or: SELECT DISCTINCT ON (x) ... (Postgres non-standard extension)
>>SLowest: SELECT DISTINCT ...
>>
>>The reason for this is that SELECT DISTINCT is effectively a GROUP BY
>> on all result fields of the query, and if a few of the aren't indexed
>> that requires a seq scan.
>>
> What if I do thing like
>
> SELECT DISTINCT table1.tid, table1.name, table1.description FROM ...
>
> Does it equal to the scenario 2 or 3?
>
> I am thinking SELECT DISTINCT table1.tid is just a variation (or the
> other way around) of SELECT DISTINCT ON (table1.tid), is that right?
>
> Thanks
>
>

Of course, a more basic question is, why so much repeated data?
Perhaps you should look at the design in your tables. I'm not assuming
they are wrong, only suggesting that you look.

Browse pgsql-sql by date

  From Date Subject
Next Message Céline Rivière 2002-02-28 14:44:31 Stored procedures returning TEXT
Previous Message Karel Zak 2002-02-28 11:55:02 Re: backward SQL query