Re: Unicode restriction

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: olly(at)lfix(dot)co(dot)uk, pgsql-hackers(at)postgresql(dot)org, 232217(at)bugs(dot)debian(dot)org
Subject: Re: Unicode restriction
Date: 2004-08-03 15:15:34
Message-ID: 23200.1091546134@sss.pgh.pa.us
Lists: pgsql-hackers

Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> writes:
> Before 7.4, UTF-8 was converted to ISO 10646 so that it could be
> handled by the regex routines. There was a limitation in the regex
> routines: they could not handle characters wider than 2 bytes. In other
> words, only 16-bit UCS-2 was supported. That's why ISO 10646 code
> points of 0x10000 and above are rejected.

> I'm not sure whether the regex routines included in 7.4 or later have
> this restriction. If not, we could probably remove the check (at the
> cost of losing data compatibility).

It looks to me like the regex routines now use pg_wchar, so I don't
think we need the restriction any longer.

regards, tom lane
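
For illustration only, the restriction discussed above can be sketched as a range check on the decoded code point. This is not PostgreSQL source; the names UCS2_MAX, fits_in_ucs2, and utf8_length are hypothetical, and the sketch assumes the limit is simply "the code point must fit in 16 bits". It also shows that the limit concerns the width of the decoded character rather than the UTF-8 byte count: a 3-byte UTF-8 sequence still fits in 16 bits, while a 4-byte sequence does not.

```python
# Illustrative sketch, not PostgreSQL code. All names here are hypothetical.

UCS2_MAX = 0xFFFF  # largest code point representable in a 16-bit UCS-2 unit


def fits_in_ucs2(ch: str) -> bool:
    """Return True if the single character's code point fits in 16 bits."""
    return ord(ch) <= UCS2_MAX


def utf8_length(ch: str) -> int:
    """Return the number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    # One ASCII character, one 2-byte, one 3-byte, and one 4-byte UTF-8 character.
    samples = ["A", "\u00e9", "\u3042", "\U00010400"]
    for ch in samples:
        print(
            f"U+{ord(ch):04X} "
            f"utf8_bytes={utf8_length(ch)} "
            f"fits_in_ucs2={fits_in_ucs2(ch)}"
        )
```

A wider internal character type would make such a check unnecessary, since code points of 0x10000 and above could then be stored directly.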
