Re: [HACKERS] DISTINCT and ORDER BY bug?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Don Baccus <dhogaza(at)pacifier(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] DISTINCT and ORDER BY bug?
Date: 2000-02-07 17:10:39
Message-ID: 20101.949943439@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Don Baccus <dhogaza(at)pacifier(dot)com> writes:
SQL> select distinct x,y+1 from foo order by x+y+1
> *
> ERROR at line 1:
> ORA-01791: not a SELECTed expression

Actually, that was a little unfair, since their parser no doubt parsed
"x+y+1" as (x+y)+1, leaving no common subexpression visible. Do they
accept

select distinct x,y+1 from foo order by x+(y+1)

>>> At least, the rule is simple if you can compare expression trees.

>> I think we have something pretty similar for GROUP BY, actually,
>> so it may not be hard to make this work.

On further thought, I think the real implementation issue is that
doing SELECT DISTINCT ORDER BY requires either two sorting steps
(sort by DISTINCT fields, "uniq" filter, sort again by ORDER BY fields)
or else some very hairy logic to figure out that ORDER BY x+1
"implies" ORDER BY x. In fact I'm not sure it does imply it
in the general case. In your original example, the requested sort
was ORDER BY upper(x), but that doesn't guarantee that the tuples
will be ordered adequately for duplicate-x elimination. For example,
that ORDER BY might yield

Ansel Adams
Don Baccus
DON BACCUS
Don Baccus
Joe Blow
...

which is a valid sort by upper(x), but a uniq filter on plain x
will fail to get rid of the second occurrence of "Don Baccus" as
it should.

Possibly we could make this work by implicitly expanding the ORDER BY
to "ORDER BY upper(x), x" which would ensure that the duplicate x's
are brought together. I am not sure this will give the right results
always, but it seems promising. We are assuming here that upper(x)
gives equal outputs for equal inputs, so it would fall down on random(x)
--- I suppose we could refuse to do this if we see a function that is
marked non-constant-foldable in pg_proc...

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 2000-02-07 17:11:56 Re: [HACKERS] New Globe
Previous Message Marten Feldtmann 2000-02-07 17:06:18 Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL