From: | Bruno Wolff III <bruno(at)wolff(dot)to> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Warts with SELECT DISTINCT |
Date: | 2006-05-04 14:06:11 |
Message-ID: | 20060504140611.GA19321@wolff.to |
Lists: | pgsql-hackers |
On Thu, May 04, 2006 at 02:39:33 -0400,
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Bruno Wolff III <bruno(at)wolff(dot)to> writes:
> > ... it would be OK to rewrite
> > SELECT DISTINCT x ORDER BY foo(x)
> > as
> > SELECT DISTINCT ON (foo(x), x) x ORDER BY foo(x)
>
> This assumes that x = y implies foo(x) = foo(y), which is something
> that's not necessarily the case, mainly because a datatype's "="
> function need not have a lot to do with the behavior of arbitrary
> functions foo(), especially if foo() yields a different datatype.
> The citext datatype is an easy counterexample: it thinks "foo" = "Foo",
> but md5() of those values will not yield the same answers.
>
> The bottom line here is that this sort of deduction requires more
> understanding of the properties of datatypes and functions than
> our existing catalogs allow the planner to obtain.
Thanks for pointing that out. I should have realized that this was the same
issue (or at least a closely related one) that I initially thought would be a
problem. But then I started thinking that '=' promised more than it does and
assumed that x = y implies foo(x) = foo(y), which, as you point out, isn't
always true.
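
To make the counterexample concrete, here is a minimal Python sketch of why the rewrite can change results. It is only an illustration, not how the planner works: `CIText` is a hypothetical stand-in for a case-insensitive type like citext, and `md5_hex`, `distinct`, and `distinct_on_foo_and_x` are made-up helper names that mimic the two query forms on an in-memory list.

```python
import hashlib


class CIText:
    """Hypothetical stand-in for citext: '=' ignores case."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, CIText) and self.value.lower() == other.value.lower()

    def __hash__(self):
        # Must agree with __eq__: equal values hash alike.
        return hash(self.value.lower())


def md5_hex(x):
    """Plays the role of foo(): hashes the exact bytes, so case matters."""
    return hashlib.md5(x.value.encode("utf-8")).hexdigest()


def distinct(rows):
    """Mimics SELECT DISTINCT x: one row per '=' equivalence class."""
    return list(dict.fromkeys(rows))


def distinct_on_foo_and_x(rows, foo):
    """Mimics SELECT DISTINCT ON (foo(x), x) x: one row per (foo(x), x) pair."""
    seen = set()
    result = []
    for x in rows:
        key = (foo(x), x)
        if key not in seen:
            seen.add(key)
            result.append(x)
    return result


rows = [CIText("foo"), CIText("Foo")]
plain = distinct(rows)
rewritten = distinct_on_foo_and_x(rows, md5_hex)
print(len(plain), len(rewritten))
```

Under these assumptions the two values compare equal, so the plain DISTINCT keeps one row, while the rewritten form keeps both because their md5 values differ. That difference in row count is exactly why the rewrite is not safe in general.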