From: | Michael Fuhr <mike(at)fuhr(dot)org> |
---|---|
To: | Martin Langhoff <martin(at)catalyst(dot)net(dot)nz> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: LC_CTYPE and matching accented chars |
Date: | 2007-06-27 07:59:05 |
Message-ID: | 20070627075904.GA51122@winnie.fuhr.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Wed, Jun 27, 2007 at 09:28:24AM +1200, Martin Langhoff wrote:
> Alvaro Herrera wrote:
> > I think it would be much easier if you did something like
> >
> > select * from test where lower(to_ascii(value)) = lower(to_ascii('martín'));
> >
> > When to_ascii doesn't work (for example because it doesn't work in UTF8)
> > you may want to use convert() to recode the text to latin1 or latin9.
>
> Well, with the example above to_ascii doesn't work.
>
> select to_ascii(value) from test ;
> ERROR: encoding conversion from UTF8 to ASCII not supported
>
> And neither does convert
>
> select convert(value using utf8_to_ascii) from test ;
> ERROR: character 0xc3 of encoding "MULE_INTERNAL" has no equivalent
> in "SQL_ASCII"
As Alvaro suggested, try converting to latin1 or latin9 and then
calling to_ascii:
select 'martin' = to_ascii(convert('martín', 'latin1'), 'latin1');
?column?
----------
t
(1 row)
For other possibilities search the list archives for examples of
"unaccent" functions that normalize text to NFD (Unicode Normalization
Form D) and remove nonspacing marks. Here's a message with a couple
of PL/Perl functions:
http://archives.postgresql.org/pgsql-general/2007-01/msg00702.php
--
Michael Fuhr
From | Date | Subject | |
---|---|---|---|
Next Message | Hannes Dorbath | 2007-06-27 08:29:27 | "Quota" |
Previous Message | Richard Huxton | 2007-06-27 07:30:43 | Re: Auto Vaccume- Time based |