Quick Links

Re: How slow is DISTINCT?

From:	TimothyReaves(at)westfieldgrp(dot)com
To:	pgsql-sql(at)postgresql(dot)org
Subject:	Re: How slow is DISTINCT?
Date:	2002-02-28 13:45:25
Message-ID:	OF6974336F.BD066288-ON85256B6E.004B7A3B@westfield-cos.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-sql

Wei Weng wrote:

> Josh Berkus wrote:
>
>>Wei Wang,
>>
>>
>>
>>>How exactly slow is DISTINCT being processed in SQL engines? (not
>>>limited to postgresql, though comments on postgresql would be most
>>>relevant)
>>>
>>>
>>I can only give you a relative result, based exlusively on my anecdotal
>> experience with 7.1:
>>
>>Fast: SELECT ...
>>Slower: SELECT ... GROUP BY x,y,z
>> or: SELECT DISCTINCT ON (x) ... (Postgres non-standard extension)
>>SLowest: SELECT DISTINCT ...
>>
>>The reason for this is that SELECT DISTINCT is effectively a GROUP BY
>> on all result fields of the query, and if a few of the aren't indexed
>> that requires a seq scan.
>>
> What if I do thing like
>
> SELECT DISTINCT table1.tid, table1.name, table1.description FROM ...
>
> Does it equal to the scenario 2 or 3?
>
> I am thinking SELECT DISTINCT table1.tid is just a variation (or the
> other way around) of SELECT DISTINCT ON (table1.tid), is that right?
>
> Thanks
>
>

Of course, a more basic question is, why so much repeated data?
Perhaps you should look at the design in your tables. I'm not assuming
they are wrong, only suggesting that you look.

Browse pgsql-sql by date

	From	Date	Subject
Next Message	Céline Rivière	2002-02-28 14:44:31	Stored procedures returning TEXT
Previous Message	Karel Zak	2002-02-28 11:55:02	Re: backward SQL query