From: | Darren Ferguson <darren(at)crystalballinc(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Case Insensitive Data Type |
Date: | 2002-05-24 19:03:06 |
Message-ID: | Pine.LNX.4.10.10205241500460.26467-100000@thread.crystalballinc.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Tom
Just out of interest and i am not sure if this is possible or practical
but:
Could / Would it be very difficult to add ON SELECT triggers to Postgres?
This could help a little.
It would be as a last resort.
SELECT email FROM foo WHERE email = 'ABCDE';
TRIGGER could have a part where you could then lower the ABCDE before the
query was run.
Just a thought
Darren Ferguson
On Fri, 24 May 2002, Tom Lane wrote:
> "Russell Black" <russell(dot)black(at)iarchives(dot)com> writes:
> > I'm using email addresses as a primary key in one of my tables. Currently,=
> > I have to ensure that the email addresses are converted to lower case befo=
> > re they get put into the table, and that all lookups on that field are conv=
> > erted to lower case before the select statement, in order to ensure that Jo=
> > e(at)Somewhere(dot)com is the same as joe(at)somewhere(dot)com(dot)
>
> A partial answer to this is
>
> CREATE UNIQUE INDEX foo_key on foo (lower(email));
>
> which will prevent duplicate entries with the same-up-to-case-folding
> address. However you'd still have to query with queries like
>
> SELECT * FROM foo WHERE lower(email) = lower('probe value');
>
> if the probe values you are handed can't be relied on to be downcased
> already. (I tried to think of a way to insert the lower() automatically
> using a rule, but couldn't come up with anything --- ideas anyone?)
>
>
> > Does anyone know of a case-insensitive data type?
>
> I do not think you should go away feeling that a case-insensitive data
> type would magically solve your problems. Today you might think that
> case-folding is what you want, but by no later than next week you would
> figure out that there are other sorts of normalizations you'll also need
> to do on provided addresses --- one obvious example is stripping leading
> and trailing blanks and reducing consecutive blanks to a single blank.
> In fact if you really want to look like you know what you're doing,
> you'll have to strip out the comment portions of an address entirely.
> For example, these are all equivalent forms per RFC specs:
> joe(at)blow(dot)com
> Joe Blow <joe(at)blow(dot)com>
> "Joe Blow" <joe(at)blow(dot)com>
> joe(at)blow(dot)com (Joe Blow)
> and should be recognized as such by anything that pretends to know
> what email addresses are. (For that matter, you do know that the
> specs say only the hostname part should be assumed case-insensitive,
> don't you?)
>
> So the real bottom line here is that you'd better figure a way to pass
> the input strings through an address-normalization function. "lower()"
> might do as a first-order approximation but you won't stay with it
> forever.
>
> > Can I create a custom data type to do this?
>
> You probably could, but it'd likely be more work than you want. A
> minimal implementation would require an input function (which would
> apply the normalization rules you want), an output function (for which
> you could probably just commandeer textout), and a no-op text() coercion
> function (assuming you'd like to do anything besides display the value).
> Then you'd have to create a set of index operators and access method
> entries, if you intended to make this indexable --- these could all
> piggyback on text operators, but you'd still need to do the work of
> making the catalog entries for them.
>
> Assuming you did have all that, I think a query like
>
> select * from foo where email = 'Joe Blow <joe(at)blow(dot)com>'
>
> would indeed work the way you want --- the initially untyped string
> literal would eventually get coerced to your datatype, and then your
> input conversion routine could do the right things to it. But it'll
> likely be easier to make it happen on the client side ;-)
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>
From | Date | Subject | |
---|---|---|---|
Next Message | Doug Fields | 2002-05-24 19:04:05 | Re: Altering existing table to be WITHOUT OIDs |
Previous Message | Gregory Seidman | 2002-05-24 19:02:08 | Re: Question on crypt password |