From: | Steve Crawford <scrawford(at)pinpointresearch(dot)com> |
---|---|
To: | Steve Atkins <steve(at)blighty(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Email data type |
Date: | 2004-05-17 16:21:54 |
Message-ID: | 200405170921.54981.scrawford@pinpointresearch.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Monday 17 May 2004 8:45 am, Steve Atkins wrote:
> Also, a(at)10(dot)11(dot)12(dot)13 is a syntactically valid email address, in the
> .13 TLD. It does not deliver to 10.11.12.13, or anywhere else, as
> of today, unless the MTA or local recursive resolver is broken (a
> common case). a(at)[10(dot)11(dot)12(dot)13] is a whole other thing. As is
> a(at)[::10.11.12.13] and various other IPv6 variants.
Along those lines a(at)foo(dot)bar(dot)com and a(at)[foo(dot)bar(dot)com] are valid but they
don't necessarily refer to the same mailbox (depends on the mx for
foo.bar.com)
> Parsing email addresses is a significant part of my day job, and
> email address validation is a lot harder than it looks at first
> sight.
Yes, indeed.
> Don't forget quoting, whitespace, escaping and nesting
> parenthetical comments....
The "just looking for an @" breaks pretty quickly, say with
"joe@"some.dom or a myriad of other variations.
> In some contexts the empty string is a valid email address. In some
> contexts "Postmaster" is a valid email address.
As are postmaster and pOsTmaSTeR and POSTmaster and they are all the
same address.
> I'm not entirely convinced that an email address is a simple and
> well-defined enough datatype to handle comprehensively within the
> DB. The validation decisions are complex and vary from application
> to application.
>
> (I use two text columns - localpart and domainpart, with an index
> on reverse(lower(domainpart)) and leave validation to the
> application, myself).
Indeed. A problem with the "email address" datatype is that it hinders
normalization:
"Joe User" <joe(at)user(dot)dom> is valid but a database designer would
probably prefer columns for name and email, or if the addresses were
all people, firstname, middlename, lastname, email.
As you mentioned, the email can be broken into localpart and
domainpart but if the app requires it, the domainpart could be
further rendered into toplevel (so you could find all the .gov or
.edu), secondlevel (at least corresponds to a registrant) and
subdomain(s) as necessary.
Cheers,
Steve
From | Date | Subject | |
---|---|---|---|
Next Message | Steve Atkins | 2004-05-17 16:22:20 | Re: Email data type |
Previous Message | Dave Page | 2004-05-17 16:17:59 | Re: Email data type |