| From: | TimothyReaves(at)westfieldgrp(dot)com |
|---|---|
| To: | pgsql-sql(at)postgresql(dot)org |
| Subject: | Re: How slow is DISTINCT? |
| Date: | 2002-02-28 13:45:25 |
| Message-ID: | OF6974336F.BD066288-ON85256B6E.004B7A3B@westfield-cos.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-sql |
Wei Weng wrote:
> Josh Berkus wrote:
>
>>Wei Wang,
>>
>>
>>
>>>How exactly slow is DISTINCT being processed in SQL engines? (not
>>>limited to postgresql, though comments on postgresql would be most
>>>relevant)
>>>
>>>
>>I can only give you a relative result, based exlusively on my anecdotal
>> experience with 7.1:
>>
>>Fast: SELECT ...
>>Slower: SELECT ... GROUP BY x,y,z
>> or: SELECT DISCTINCT ON (x) ... (Postgres non-standard extension)
>>SLowest: SELECT DISTINCT ...
>>
>>The reason for this is that SELECT DISTINCT is effectively a GROUP BY
>> on all result fields of the query, and if a few of the aren't indexed
>> that requires a seq scan.
>>
> What if I do thing like
>
> SELECT DISTINCT table1.tid, table1.name, table1.description FROM ...
>
> Does it equal to the scenario 2 or 3?
>
> I am thinking SELECT DISTINCT table1.tid is just a variation (or the
> other way around) of SELECT DISTINCT ON (table1.tid), is that right?
>
> Thanks
>
>
Of course, a more basic question is, why so much repeated data?
Perhaps you should look at the design in your tables. I'm not assuming
they are wrong, only suggesting that you look.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Céline Rivière | 2002-02-28 14:44:31 | Stored procedures returning TEXT |
| Previous Message | Karel Zak | 2002-02-28 11:55:02 | Re: backward SQL query |