From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Greg Stark <gsstark(at)mit(dot)edu> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: MAX/MIN optimization via rewrite (plus query rewrites generally) |
Date: | 2004-11-11 22:46:17 |
Message-ID: | 12475.1100213177@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Greg Stark <gsstark(at)mit(dot)edu> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> Oh? How is a first() aggregate going to know what sort order you want
>> within the group?
> It would look something like
> select x,first(a),first(b) from (select x,a,b from table order by x,y) group by x
> which is equivalent to
> select DISTINCT ON (x) x,a,b from table ORDER BY x,y
No, it is not. The GROUP BY has no commitment to preserve order ---
consider for example the possibility that we implement the GROUP BY by
hashing.
> The group by can see that the subquery is already sorted by x and
> doesn't need to be resorted. In fact I believe you added the smarts to
> detect that condition in response to a user asking about precisely
> this type of scenario.
The fact that an optimization is present does not make it part of the
guaranteed semantics of the language.
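[Editorial illustration, not part of the original message.] The point about ordering can be sketched outside the database. The Python below is a simulation under stated assumptions, not PostgreSQL's executor: the sample rows and the helpers `first_per_group` and `min_per_group` are hypothetical names invented for this example. It shows that an order-sensitive first() returns different answers depending on the order in which rows happen to reach the grouping step, while an order-independent aggregate such as min() does not.

```python
# Hypothetical sample rows: (x, y, a). Not taken from the thread.
rows = [
    (1, 10, "a1-first"),
    (1, 20, "a1-second"),
    (2, 5, "a2-first"),
    (2, 7, "a2-second"),
]


def first_per_group(rows):
    """Order-sensitive aggregate: keep the first `a` seen for each x."""
    result = {}
    for x, _y, a in rows:
        result.setdefault(x, a)  # only the first arrival for x is kept
    return result


def min_per_group(rows):
    """Order-independent aggregate: smallest y for each x."""
    result = {}
    for x, y, _a in rows:
        result[x] = min(result.get(x, y), y)
    return result


# Arrival order the subquery's ORDER BY x, y would suggest.
sorted_input = sorted(rows, key=lambda r: (r[0], r[1]))
# Another arrival order the grouping step is free to see,
# since grouping itself makes no promise about input order.
reversed_input = sorted_input[::-1]

first_sorted = first_per_group(sorted_input)
first_reversed = first_per_group(reversed_input)
min_sorted = min_per_group(sorted_input)
min_reversed = min_per_group(reversed_input)

print(first_sorted)
print(first_reversed)
print(min_sorted == min_reversed)
```

The two first() results differ even though both inputs contain exactly the same rows, which is why a result that depends on arrival order cannot be relied on unless the language itself guarantees that order.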
Basically, first() is a broken concept in SQL. Of course DISTINCT ON
is broken too for the same reasons, but I do not see that first() is
one whit less of a kluge than DISTINCT ON.
regards, tom lane