Re: SELECT DISTINCT

From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: "salah jubeh" <s_jubeh(at)yahoo(dot)com>,"pgsql" <pgsql-general(at)postgresql(dot)org>
Subject: Re: SELECT DISTINCT
Date: 2013-01-18 00:06:52
Message-ID: 20130118000652.108980@gmx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

salah jubeh wrote:

> During my work, I have seen a common practice of using DISTINCT.
> Some will argue that developer should know the effect of using
> it, but keep in mind not all developers are gurus in RDBMs.

"SELECT DISTINCT eliminates duplicate rows from the result."
Personally, I would not want to keep a programmer who could not
grasp that concept.

> Using DISTINCT might lead to a huge performance  degradation
> because of sort and filter or hashaggregate operations.

More than that, I have often seen it added when a JOIN was
inadequately constrained and the programmer saw duplicates in the
output and added DISTINCT in response. The problem you have beyond
performance in such cases is that it is usually not showing correct
results; and worse, they are wrong but *plausible*. I would not
want to encourage that kind of sloppy thinking.

> I think any query that returns a unique column (primary key,
> unique) which is not duplicated in some way (join) can use this
> optimisation technique.

I agree that if the planner searched for that, there would be cases
where the DISTINCT keyword could be determined to be a noise word.
The problem with that is that such searching in the planner would
not be free -- doing it accurately would increase planning cost for
every query which was legitimately using the feature. The community
is generally loath to add runtime costs to properly written queries
to try to minimize the penalty paid by those who specify features
they don't need.

-Kevin

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2013-01-18 01:02:44 Re: SELECT DISTINCT
Previous Message Edson Richter 2013-01-17 22:25:12 Loggin SQL warnings in JDBC driver