Re: Unicode normalization

From:	Andreas Kalsch <andreaskalsch(at)gmx(dot)de>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Unicode normalization
Date:	2009-09-17 12:58:39
Message-ID:	4AB2327F.10707@gmx.de
Lists:	pgsql-general

I use UTF-8 as the encoding at every level, so I don't need this
expensive call:

plpy.execute("select setting from pg_settings where name = 'server_encoding'");

Additionally, I want to preserve the original letter case.

For this purpose my solution still fits my needs. It is not the one you
quoted below, though, but this one:

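CREATE OR REPLACE FUNCTION simplify (str text)
RETURNS text
AS $$
import unicodedata

# Decompose to NFKD so that accents become separate combining characters.
s = unicodedata.normalize('NFKD', str.decode('UTF-8'))
# Drop the combining characters and keep the base characters, case unchanged.
s = ''.join(c for c in s if unicodedata.combining(c) == 0)
return s.encode('UTF-8')
$$ LANGUAGE plpythonu;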
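
To try the same logic outside the database, here is a minimal sketch in
plain Python 3. It is only an illustration under stated assumptions: the
function above runs as Python 2 under plpythonu and works on UTF-8 byte
strings, whereas this version takes a Python 3 str, so the decode/encode
steps are not needed. The sample inputs are made up for the example.

```python
import unicodedata


def simplify(text: str) -> str:
    """Strip accents and other combining marks while keeping letter case."""
    # NFKD splits each character into a base character plus combining marks
    # and also expands compatibility characters such as ligatures.
    decomposed = unicodedata.normalize("NFKD", text)
    # combining() returns 0 for base characters; anything else is a mark.
    return "".join(c for c in decomposed if unicodedata.combining(c) == 0)


print(simplify("Crème Brûlée"))  # prints "Creme Brulee"
```

Letters that have no Unicode decomposition, for example "ø", pass
through unchanged with this approach.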

Andi

>> 2) Transferring this to PL/Python:
>>
>> CREATE OR REPLACE FUNCTION test (str text)
>> RETURNS text
>> AS $$
>> import unicodedata
>> return unicodedata.normalize('NFKD', str.decode('UTF-8'))
>> $$ LANGUAGE plpythonu;

In response to

Re: Unicode normalization at 2009-09-17 04:01:57 from Alvaro Herrera
