| From: | Bruno Wolff III <bruno(at)wolff(dot)to> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Warts with SELECT DISTINCT |
| Date: | 2006-05-04 14:06:11 |
| Message-ID: | 20060504140611.GA19321@wolff.to |
| Lists: | pgsql-hackers |
On Thu, May 04, 2006 at 02:39:33 -0400,
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Bruno Wolff III <bruno(at)wolff(dot)to> writes:
> > ... it would be OK to rewrite
> > SELECT DISTINCT x ORDER BY foo(x)
> > as
> > SELECT DISTINCT ON (foo(x), x) x ORDER BY foo(x)
>
> This assumes that x = y implies foo(x) = foo(y), which is something
> that's not necessarily the case, mainly because a datatype's "="
> function need not have a lot to do with the behavior of arbitrary
> functions foo(), especially if foo() yields a different datatype.
> The citext datatype is an easy counterexample: it thinks "foo" = "Foo",
> but md5() of those values will not yield the same answers.
>
> The bottom line here is that this sort of deduction requires more
> understanding of the properties of datatypes and functions than
> our existing catalogs allow the planner to obtain.

Thanks for pointing that out. I should have realized that this was the same
issue (or at least close to it) that I initially thought would be a problem.
I then talked myself into believing that '=' promised more than it does and
assumed that x = y implies foo(x) = foo(y), which, as you point out, isn't
always true.
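
A minimal sketch of the citext counterexample, written in Python rather than SQL so it runs without a database. The `CIText` class and `md5_hex` helper are hypothetical stand-ins for the citext datatype and for foo(); they are not PostgreSQL APIs. The sketch only illustrates that values equal under a type's "=" can still give different results from a function.

```python
import hashlib


class CIText:
    """Hypothetical stand-in for citext: equality ignores case."""

    def __init__(self, value: str) -> None:
        self.value = value

    def __eq__(self, other: object) -> bool:
        return isinstance(other, CIText) and self.value.lower() == other.value.lower()

    def __hash__(self) -> int:
        # Must agree with __eq__, so hash the lowercased form.
        return hash(self.value.lower())


def md5_hex(x: CIText) -> str:
    """Plays the role of foo(): it sees the raw text, not the '=' classes."""
    return hashlib.md5(x.value.encode("utf-8")).hexdigest()


a, b = CIText("foo"), CIText("Foo")

equal_under_eq = a == b                # the datatype's "=" says they match
same_md5 = md5_hex(a) == md5_hex(b)    # but foo() tells them apart

# DISTINCT keeps one arbitrary member of each "=" class, so the value
# that ORDER BY foo(x) would sort on depends on which member survived.
sort_keys = {md5_hex(x) for x in (a, b)}

print(equal_under_eq, same_md5, len(sort_keys))  # True False 2
```

Because the two inputs collapse to one row under "=" but map to two different sort keys, the position of that row in the ordered output would depend on which representative DISTINCT happened to keep.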