Re: Case Insensitive Data Type

From: Darren Ferguson <darren(at)crystalballinc(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Case Insensitive Data Type
Date: 2002-05-24 19:03:06
Message-ID: Pine.LNX.4.10.10205241500460.26467-100000@thread.crystalballinc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Tom

Just out of interest and i am not sure if this is possible or practical
but:

Could / Would it be very difficult to add ON SELECT triggers to Postgres?

This could help a little.

It would be as a last resort.

SELECT email FROM foo WHERE email = 'ABCDE';

TRIGGER could have a part where you could then lower the ABCDE before the
query was run.

Just a thought

Darren Ferguson

On Fri, 24 May 2002, Tom Lane wrote:

> "Russell Black" <russell(dot)black(at)iarchives(dot)com> writes:
> > I'm using email addresses as a primary key in one of my tables. Currently,=
> > I have to ensure that the email addresses are converted to lower case befo=
> > re they get put into the table, and that all lookups on that field are conv=
> > erted to lower case before the select statement, in order to ensure that Jo=
> > e(at)Somewhere(dot)com is the same as joe(at)somewhere(dot)com(dot)
>
> A partial answer to this is
>
> CREATE UNIQUE INDEX foo_key on foo (lower(email));
>
> which will prevent duplicate entries with the same-up-to-case-folding
> address. However you'd still have to query with queries like
>
> SELECT * FROM foo WHERE lower(email) = lower('probe value');
>
> if the probe values you are handed can't be relied on to be downcased
> already. (I tried to think of a way to insert the lower() automatically
> using a rule, but couldn't come up with anything --- ideas anyone?)
>
>
> > Does anyone know of a case-insensitive data type?
>
> I do not think you should go away feeling that a case-insensitive data
> type would magically solve your problems. Today you might think that
> case-folding is what you want, but by no later than next week you would
> figure out that there are other sorts of normalizations you'll also need
> to do on provided addresses --- one obvious example is stripping leading
> and trailing blanks and reducing consecutive blanks to a single blank.
> In fact if you really want to look like you know what you're doing,
> you'll have to strip out the comment portions of an address entirely.
> For example, these are all equivalent forms per RFC specs:
> joe(at)blow(dot)com
> Joe Blow <joe(at)blow(dot)com>
> "Joe Blow" <joe(at)blow(dot)com>
> joe(at)blow(dot)com (Joe Blow)
> and should be recognized as such by anything that pretends to know
> what email addresses are. (For that matter, you do know that the
> specs say only the hostname part should be assumed case-insensitive,
> don't you?)
>
> So the real bottom line here is that you'd better figure a way to pass
> the input strings through an address-normalization function. "lower()"
> might do as a first-order approximation but you won't stay with it
> forever.
>
> > Can I create a custom data type to do this?
>
> You probably could, but it'd likely be more work than you want. A
> minimal implementation would require an input function (which would
> apply the normalization rules you want), an output function (for which
> you could probably just commandeer textout), and a no-op text() coercion
> function (assuming you'd like to do anything besides display the value).
> Then you'd have to create a set of index operators and access method
> entries, if you intended to make this indexable --- these could all
> piggyback on text operators, but you'd still need to do the work of
> making the catalog entries for them.
>
> Assuming you did have all that, I think a query like
>
> select * from foo where email = 'Joe Blow <joe(at)blow(dot)com>'
>
> would indeed work the way you want --- the initially untyped string
> literal would eventually get coerced to your datatype, and then your
> input conversion routine could do the right things to it. But it'll
> likely be easier to make it happen on the client side ;-)
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Doug Fields 2002-05-24 19:04:05 Re: Altering existing table to be WITHOUT OIDs
Previous Message Gregory Seidman 2002-05-24 19:02:08 Re: Question on crypt password