From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Søren Vainio <sva(at)Netpointers(dot)com> |
Cc: | "'Andreas Joseph Krogh'" <andreak(at)officenet(dot)no>, "'pgsql-sql(at)postgresql(dot)org'" <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: Scadinavian characters in regular expressions |
Date: | 2002-04-09 13:33:55 |
Message-ID: | 28561.1018359235@sss.pgh.pa.us |
Lists: | pgsql-sql |
Søren Vainio <sva(at)Netpointers(dot)com> writes:
> Using \s does produce FALSE for SELECT 'one two three' ~
> '^[^\s]+[\s][^\s]+$';
> But it also produces FALSE for any two-word string ex:
> SELECT 'one two' ~ '^[^\s]+[\s][^\s]+$'; where I would expect TRUE???
> (I am using PostgreSQL 7.1.3)
I do not believe that Postgres' regular expression engine recognizes \s
as meaning anything except "s". See
http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/functions-matching.html
In the above, it's even worse: the backslashes were eaten by the
string-literal parser, so what arrived at the RE engine was just
^[^s]+[s][^s]+$ ... not likely to produce what you wanted.
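To see why both queries come back FALSE, here is a rough sketch in Python. It is an illustration only: Python's `re` module stands in for the Postgres RE engine, and `strip_backslashes` is a hypothetical, simplified model of the string-literal parser (a backslash before an ordinary character yields just that character). The literal-space pattern at the end is one possible workaround, assumed here rather than taken from the thread.

```python
import re


def strip_backslashes(literal: str) -> str:
    # Hypothetical, simplified model of the string-literal parser:
    # a backslash followed by any character yields that character alone.
    return re.sub(r"\\(.)", r"\1", literal)


def matches(pattern: str, text: str) -> bool:
    # True when the pattern matches somewhere in the text, like the ~ operator.
    return re.search(pattern, text) is not None


# The pattern as typed inside the SQL string literal.
written = r"^[^\s]+[\s][^\s]+$"

# The pattern as it would reach the RE engine under the model above.
arrived = strip_backslashes(written)

# Neither test string contains the letter "s", so [s] can never match.
two_words = matches(arrived, "one two")
three_words = matches(arrived, "one two three")

# A single word with an "s" in the middle does match the mangled pattern.
s_in_middle = matches(arrived, "misty")

# Assumed workaround: spell the space out literally instead of using \s.
space_pattern = "^[^ ]+[ ][^ ]+$"
fixed_two_words = matches(space_pattern, "one two")
fixed_three_words = matches(space_pattern, "one two three")

if __name__ == "__main__":
    print(arrived)
    print(two_words, three_words, s_in_middle)
    print(fixed_two_words, fixed_three_words)
```

The mangled pattern asks for a lone "s" between two runs of non-"s" characters, which is why word count makes no difference to the result.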
As for the original issue, I wonder whether you are storing the string
as UTF-8 or Latin1 encoding. I have a suspicion that the å
(a-ring) is actually a multibyte sequence inside the database
and for some reason Postgres isn't configured to recognize it as a
single logical character.
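The encoding suspicion can be illustrated with a small Python sketch. It assumes nothing about how the database is actually configured; it only shows that å is one byte in Latin1 and two bytes in UTF-8, and that a matcher working byte by byte (modelled here with a Python bytes pattern) does not see the UTF-8 form as a single character.

```python
import re

# U+00E5, LATIN SMALL LETTER A WITH RING ABOVE
a_ring = "\u00e5"

latin1_bytes = a_ring.encode("latin-1")  # one byte: 0xE5
utf8_bytes = a_ring.encode("utf-8")      # two bytes: 0xC3 0xA5

# A bytes pattern treats every byte as one "character".
one_unit = re.compile(rb".", re.DOTALL)

# The Latin1 form is exactly one unit, so "." covers it completely.
latin1_is_single = one_unit.fullmatch(latin1_bytes) is not None

# The UTF-8 form is two units, so a single "." cannot cover it.
utf8_is_single = one_unit.fullmatch(utf8_bytes) is not None

# A matcher that knows the encoding sees one logical character.
decoded_is_single = re.fullmatch(".", utf8_bytes.decode("utf-8")) is not None

if __name__ == "__main__":
    print(len(latin1_bytes), len(utf8_bytes))
    print(latin1_is_single, utf8_is_single, decoded_is_single)
```

If the stored bytes are UTF-8 but the matcher works one byte at a time, a bracket expression containing å would be compared against each of the two bytes separately, which would fit the behaviour described.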
regards, tom lane