From: | "David Rowley" <dgrowley(at)gmail(dot)com> |
---|---|
To: | "'David Fetter'" <david(at)fetter(dot)org> |
Cc: | "'Madison Kelly'" <linux(at)alteeve(dot)com>, <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: SELECT DISTINCT ... ORDER BY problem |
Date: | 2008-12-09 01:09:38 |
Message-ID: | 983A86461DFC4AF39F91ECB8CA361AA9@amd64 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
> -----Original Message-----
> From: David Fetter [mailto:david(at)fetter(dot)org]
> Sent: 09 December 2008 00:55
> To: David Rowley
> Cc: 'Madison Kelly'; pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] SELECT DISTINCT ... ORDER BY problem
>
> On Mon, Dec 08, 2008 at 11:16:29PM -0000, David Rowley wrote:
> > > -----Original Message-----
> > > From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-
> > > owner(at)postgresql(dot)org] On Behalf Of Madison Kelly
> > > Sent: 08 December 2008 22:19
> > > To: pgsql-general(at)postgresql(dot)org
> > > Subject: [GENERAL] SELECT DISTINCT ... ORDER BY problem
> > >
> > > Hi all,
> > >
> > > I've got a table that I am trying to SELECT DISTINCT on one column
> > > and ORDER BY on a second column, but am getting the error:
> > >
> > > SELECT DISTINCT ON expressions must match initial ORDER BY expressions
> > >
> > > I can't add the second column to the DISTINCT clause because every
> > > row is unique. Likewise, I can't add the first column to my ORDER BY
> as
> > > it'd not sort the way I need it to.
> > >
> > > Here is a simplified version of my query:
> > >
> > > \d table
> > > Table "table"
> > > Column | Type | Modifiers
> > >
> > > -----------------+---------+------------------------------------------
> ----
> > > --
> > > tbl_id | integer | not null default
> > > nextval('tbl_seq'::regclass)
> > > foo | text |
> > > bar | text |
> > >
> > > SELECT DISTINCT ON (foo) foo, bar FROM table WHERE bar < '2008-12-07
> > > 16:32:46' AND tbl_id=153 ORDER BY bar LIMIT 1;
> > >
> >
> > To make the query valid you would have to ORDER BY foo,bar
> > DISTINCT ON in this case is only going to show the first bar value for
> each
> > foo.
> >
> > Is tbl_id not your PK and only giving 1 row anyway?
> >
> > >
> > > I understand from:
> > >
> > > http://archives.postgresql.org/pgsql-sql/2007-02/msg00169.php
> > >
> > > That this is not really possible because the any given 'foo' column
> > > could match multiple 'bar' columns, so what do you search by? However,
> > > it's made some sort of decision as a value is shown in 'bar' for each
> > > 'foo'.
> > >
> > > So my question is two-fold:
> > >
> > > 1. Can I not say, somehow, "sort all results by 'bar', and return the
> > > first/last 'bar' for each distinct 'foo'?
> > >
> > > 2. Can I somehow say "Order the results using the value of 'bar' you
> > > return, regardless of where it came from"?
> >
> > You can nest queries:
> >
> > SELECT foo,bar
> > FROM (SELECT DISTINCT ON (foo) foo,
> > Bar
> > FROM table
> > WHERE bar < '2008-12-07 16:32:46'
> > AND tbl_id=153 ORDER BY foo,bar
> > ) AS t ORDER BY bar;
> >
> > Notice that I'm only applying the final order by in the outer query.
>
> When we get windowing functions, a lot of this pain will go away :)
>
Yes! Hope it won't be too long now. The patch seems to behave like it should
now :)
Hopefully we'll see it commited for 8.4.
Though this does not look too much cleaner at least it's standard SQL:
A preview for Madi:
SELECT foo,bar
FROM (SELECT foo,bar,
ROW_NUMBER() OVER (PARTITION BY foo ORDER BY bar) AS pos
FROM table
) AS t
WHERE pos = 1
ORDER BY bar;
Probably easier to understand what's going on in this one.
David.
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2008-12-09 01:15:10 | Re: SELECT DISTINCT ... ORDER BY problem |
Previous Message | Madison Kelly | 2008-12-09 00:59:05 | Re: SELECT DISTINCT ... ORDER BY problem |