Re: Unicode and unaccent()

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Unicode and unaccent()
Date: 2005-05-06 09:15:17
Message-ID: 20050506111123.1377220@localhost
Lists: pgsql-general

Mark Borins wrote:

> The encoding on my DB is Unicode, so far I have found an unaccent() function
> by looking in the mail archives it looks like the following:
>
>
> CREATE FUNCTION unaccent(text) RETURNS text AS $$
> BEGIN
>     RETURN translate($1, '\342\347\350\351\352\364\373', 'aceeeou');
> END;
> $$ LANGUAGE plpgsql IMMUTABLE STRICT;
>
> My problem is that the values like \342 are for LATIN1 type encoding. I

Why wouldn't this work just fine?
RETURN translate($1, 'éçàêè...', 'ecaee...');
With the accented characters written literally rather than as octal byte
escapes, it's even portable across encodings.
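
For illustration, here is a minimal sketch of the whole function written that way. It assumes the client_encoding setting matches what the client really sends, so the server can convert the literals to the database encoding. The name unaccent_literal is hypothetical, and the character list simply spells out the seven octal escapes from the quoted function (â ç è é ê ô û):

```sql
-- Hypothetical name, chosen so it does not clash with an existing unaccent().
CREATE OR REPLACE FUNCTION unaccent_literal(text) RETURNS text AS $$
BEGIN
    -- Literal characters instead of octal byte escapes: they arrive in the
    -- client encoding and are converted to the database encoding by the
    -- server, so the same source text works in LATIN1 and Unicode databases.
    RETURN translate($1, 'âçèéêôû', 'aceeeou');
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

-- Example call: ê, ô and é each map to their unaccented letter.
SELECT unaccent_literal('forêt côté');  -- foret cote
```

Characters that are not in the first list pass through unchanged, so extending the coverage only means appending to both strings in the same order.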

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org
