Re: DISTINCT ... ORDER BY

From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: Nabil Sayegh <postgresql(at)e-trolley(dot)de>
Cc: pgsql-novice <pgsql-novice(at)postgresql(dot)org>
Subject: Re: DISTINCT ... ORDER BY
Date: 2003-11-05 18:06:17
Message-ID: 20031105100059.L6295@megazone.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On Wed, 5 Nov 2003, Nabil Sayegh wrote:

> Stephan Szabo wrote:
>
> >>should return:
> >>
> >>myDistinctOrderedCol
> >>--------------------
> >>a
> >>c
> >>b
> >>
> >>The problem is that DISTINCT complains about ORDER BY entries not being in the target list.
> >
> >
> > Right, because given this data:
> > col1 | col2
> > a | b
> > b | c
> > b | a
> >
> > what ordering should
> > select distinct col1 from tab order by col2
> > give you?
>
> I want it to just return 1 column ;)
>
> > Does it put b first because there's a col2 with a, or b second because
> > there's a col2 with b or is it indeterminate?
>
> It shall not mention col2 at all.
> The idea is to
> 1. ORDER BY expressions
> 2. Pick only 1 column
> 3. make this column distinct without losing the sort order

You can either use something like the distinct on extension with
subselects or some variation on group by. Both of these involve
understanding however, which row you want out when you're going to
be dropping rows.

Your step 3 above isn't well defined in general. There's no single place
in the sort order for a value of a column being distincted when the value
occurs multiple times. In the data above, col1='b' occurs twice in the
sort order and you need to give an indication of which place in the sort
order you want to use.

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Corey W. Gibbs 2003-11-05 19:25:30 Remove Unused Blobs?
Previous Message Bruno Wolff III 2003-11-05 17:53:27 Re: DISTINCT ... ORDER BY