From: | Michael Fuhr <mike(at)fuhr(dot)org> |
---|---|
To: | Andrus <eetasoft(at)online(dot)ee> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Select all invalid e-mail addresses |
Date: | 2005-10-21 18:49:09 |
Message-ID: | 20051021184909.GA59288@winnie.fuhr.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Fri, Oct 21, 2005 at 08:15:23PM +0300, Andrus wrote:
> I tried
>
> SELECT email FROM customer
> WHERE email !~
> '/^[^(at)]*@(?:[^(at)]*\(dot))?[a-z0-9-_]+\.(?:a[defgilmnoqrstuwz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvxyz]|d[ejkmoz]|e[ceghrst]|f[ijkmorx]|g[abdefhilmnpqrstuwy]|h[kmnrtu]|i[delnoqrst]|j[mop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrtwy]|qa|r[eouw]|s[abcdeghijklmnortvyz]|t[cdfghjkmnoprtvwz]|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]|edu|com|net|org|gov|mil|info|biz|coop|museum|aero|name|pro)$/'
>
> but got an error
>
> ERROR: invalid regular expression: invalid character range
Aside from the fact that this regular expression is semantically wrong,
it has a few other problems:
* A hyphen (-) must come first or last in a character class if you want
it interpreted literally instead of as part of a range specification.
test=> SELECT 'abc' ~ '[a-z0-9-_]'; -- WRONG
ERROR: invalid regular expression: invalid character range
test=> SELECT 'abc' ~ '[a-z0-9_-]';
?column?
----------
t
(1 row)
* Regular expressions in PostgreSQL don't use delimiters like / at
the beginning and end of the expression.
test=> SELECT 'abc' ~ '/abc/'; -- WRONG
?column?
----------
f
(1 row)
test=> SELECT 'abc' ~ 'abc';
?column?
----------
t
(1 row)
* If you use single quotes around the regular expression then you
need to escape backslashes that should be part of the regular
expression; otherwise the backslash will be parsed by the string
parser before the string is used as a regular expression and you'll
get unexpected results. In other words, there's an extra layer of
string parsing that you have to allow for. In 8.0 and later you
can avoid this by using dollar quotes.
test=> SELECT 'abc' ~ 'a\.c'; -- WRONG
?column?
----------
t
(1 row)
test=> SELECT 'abc' ~ 'a\\.c';
?column?
----------
f
(1 row)
test=> SELECT 'a.c' ~ 'a\\.c';
?column?
----------
t
(1 row)
test=> SELECT 'abc' ~ $$a\.c$$;
?column?
----------
f
(1 row)
test=> SELECT 'a.c' ~ $$a\.c$$;
?column?
----------
t
(1 row)
--
Michael Fuhr
From | Date | Subject | |
---|---|---|---|
Next Message | Steve Atkins | 2005-10-21 18:49:54 | Re: Select all invalid e-mail addresses |
Previous Message | Scott Marlowe | 2005-10-21 18:38:13 | Re: scheduled backup |