From: | Alexander Farber <alexander(dot)farber(at)gmail(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Re: Matching uppercased russian words (\x0410-\x042F) in UTF8 database 8.4.13 |
Date: | 2013-03-22 14:23:42 |
Message-ID: | CAADeyWgrVpRsG6baR1_oPGJWRNN3bWiCNofKKG0LTf3PAiH3Tw@mail.gmail.com |
Lists: | pgsql-general |
Hello,
unfortunately octal escapes don't seem to work either:
On Tue, Mar 19, 2013 at 7:03 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alexander Farber <alexander(dot)farber(at)gmail(dot)com> writes:
>> # select 'АБВГД' ~ '^[\u0410-\u042F]{2,}$';
>> WARNING: nonstandard use of escape in a string literal
>
> I think Unicode escapes were introduced in 9.0. In 8.4 you'd probably
> have to write out the UTF8 equivalent as octal escapes :-(
# select 'АБВГД' ~ '^[\2020-\2057]{2,}$';
WARNING: nonstandard use of escape in a string literal
LINE 1: select 'АБВГД' ~ '^[\2020-\2057]{2,}$';
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
ERROR: invalid byte sequence for encoding "UTF8": 0x82
HINT: This error can also happen if the byte sequence does not
match the encoding expected by the server, which is controlled by
"client_encoding".
But writing the letters out literally in UTF8 seems to work
(I am trying to detect uppercase Russian letters as per
http://www.unicode.org/charts/PDF/U0400.pdf ):
# select 'АБВГД' ~ '^[А-Я]{2,}$';
?column?
----------
t
(1 row)
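A possible reading of the 0x82 error above, offered as a sketch rather than a confirmed diagnosis: a backslash escape in a string literal takes at most three octal digits, so \2020 is read as \202 (the single byte 0x82) followed by a literal 0, and a lone 0x82 is not valid UTF8. The quoted suggestion asks for the UTF8 bytes of each character written as octal escapes, not the code point itself in octal. The Python snippet below works out those bytes for U+0410 and U+042F; the helper name utf8_octal_escapes is made up for illustration.

```python
def utf8_octal_escapes(ch: str) -> str:
    """Return the UTF-8 bytes of ch as backslash-octal escapes, one per byte."""
    return "".join("\\%03o" % b for b in ch.encode("utf-8"))


# With a three-digit limit, \2020 splits into \202 plus a leftover '0'.
first_byte = int("202", 8)

lo = utf8_octal_escapes("\u0410")  # CYRILLIC CAPITAL LETTER A
hi = utf8_octal_escapes("\u042f")  # CYRILLIC CAPITAL LETTER YA

print(hex(first_byte))  # 0x82
print(lo, hi)           # \320\220 \320\257
```

Under that reading the range would be spelled with the two-byte sequences \320\220 and \320\257 inside an E'' string, which are the same bytes as the literal А-Я range that already works.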
And then I try to solve my 2nd problem (detecting the same
letter 3 times in a row, which is rare in the Russian language):
# select 'ОШИБББКА' ~ '(.)\1\1';
WARNING: nonstandard use of escape in a string literal
LINE 1: select 'ОШИБББКА' ~ '(.)\1\1';
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
?column?
----------
f
(1 row)
Does anybody know why this fails in 8.4.13?
According to table 9-18 in
http://www.postgresql.org/docs/8.4/static/functions-matching.html
it should be OK to use \1 to refer back to
the part captured by parentheses.
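One likely explanation, assuming standard_conforming_strings is off (the WARNING suggests it is): in a plain literal the \1 is consumed as an octal escape before the regular expression engine ever sees it, so the pattern ends up containing two 0x01 characters instead of two back references. Python string literals behave the same way, which makes the effect easy to reproduce outside the database. The sketch below is only an analogy in Python, not PostgreSQL code.

```python
import re

word = "ОШИБББКА"

# Plain literal: "\1" is an octal escape, so the pattern holds two 0x01 characters.
eaten = "(.)\1\1"
# Raw literal: the backslashes survive and the regex engine sees back references.
kept = r"(.)\1\1"

print(eaten == "(.)\x01\x01")          # True
print(re.search(eaten, word))          # None
print(re.search(kept, word).group(0))  # БББ
```

If the same thing is happening in 8.4.13, doubling the backslashes in an escape string, E'(.)\\1\\1', should let the back references reach the regular expression engine intact.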
Regards
Alex