| From: | Michael Fuhr <mike(at)fuhr(dot)org> | 
|---|---|
| To: | Andrus <eetasoft(at)online(dot)ee> | 
| Cc: | pgsql-general(at)postgresql(dot)org | 
| Subject: | Re: Select all invalid e-mail addresses | 
| Date: | 2005-10-21 18:49:09 | 
| Message-ID: | 20051021184909.GA59288@winnie.fuhr.org | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
On Fri, Oct 21, 2005 at 08:15:23PM +0300, Andrus wrote:
> I tried
> 
> SELECT   email  FROM customer
>   WHERE  email !~
> '/^[^(at)]*@(?:[^(at)]*\(dot))?[a-z0-9-_]+\.(?:a[defgilmnoqrstuwz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvxyz]|d[ejkmoz]|e[ceghrst]|f[ijkmorx]|g[abdefhilmnpqrstuwy]|h[kmnrtu]|i[delnoqrst]|j[mop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrtwy]|qa|r[eouw]|s[abcdeghijklmnortvyz]|t[cdfghjkmnoprtvwz]|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]|edu|com|net|org|gov|mil|info|biz|coop|museum|aero|name|pro)$/'
> 
> but got an error
> 
> ERROR:  invalid regular expression: invalid character range
Aside from the fact that this regular expression is semantically wrong,
it has a few other problems:
* A hyphen (-) must come first or last in a character class if you want
it interpreted literally instead of as part of a range specification.
  test=> SELECT 'abc' ~ '[a-z0-9-_]';  -- WRONG
  ERROR:  invalid regular expression: invalid character range
  
  test=> SELECT 'abc' ~ '[a-z0-9_-]';
   ?column? 
  ----------
   t
  (1 row)
* Regular expressions in PostgreSQL don't use delimiters like / at
the beginning and end of the expression.
  test=> SELECT 'abc' ~ '/abc/';  -- WRONG
   ?column? 
  ----------
   f
  (1 row)
  
  test=> SELECT 'abc' ~ 'abc';
   ?column? 
  ----------
   t
  (1 row)
* If you use single quotes around the regular expression then you
need to escape backslashes that should be part of the regular
expression; otherwise the backslash will be parsed by the string
parser before the string is used as a regular expression and you'll
get unexpected results.  In other words, there's an extra layer of
string parsing that you have to allow for.  In 8.0 and later you
can avoid this by using dollar quotes.
  test=> SELECT 'abc' ~ 'a\.c';  -- WRONG
   ?column? 
  ----------
   t
  (1 row)
  
  test=> SELECT 'abc' ~ 'a\\.c';
   ?column? 
  ----------
   f
  (1 row)
  
  test=> SELECT 'a.c' ~ 'a\\.c';
   ?column? 
  ----------
   t
  (1 row)
  test=> SELECT 'abc' ~ $$a\.c$$;
   ?column? 
  ----------
   f
  (1 row)
  
  test=> SELECT 'a.c' ~ $$a\.c$$;
   ?column? 
  ----------
   t
  (1 row)
-- 
Michael Fuhr
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Steve Atkins | 2005-10-21 18:49:54 | Re: Select all invalid e-mail addresses | 
| Previous Message | Scott Marlowe | 2005-10-21 18:38:13 | Re: scheduled backup |