From: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Regexps vs. locale |
Date: | 2008-12-08 17:39:31 |
Message-ID: | 87vdtuo9bg.fsf@news-spur.riddles.org.uk |
Lists: | pgsql-hackers |
>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
>> Obviously, this happens because the locale support functions in
>> backend/regex/regc_locale.c are (presumably intentionally)
>> crippled so as not to support non-ASCII chars, despite all the
>> code there using wide chars for everything otherwise.
Tom> It's not so much intentional as that no one has gotten around to
Tom> making it work. The difficulty is that the wide-char codes we
Tom> are using might not match what the <wctype.h> functions expect,
Tom> and it's unclear what we could do to fix that.
Couldn't we follow the example of lower(), and convert the string to
wchar_t using mbstowcs (rather than pg_wchar and pg_mb2wchar)?
This obviously requires that we have a matching lc_ctype for the
encoding, but we insist on that now anyway, no?
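
A minimal sketch of that idea, assuming a POSIX-style C library: convert the multibyte string with mbstowcs and classify the resulting wchar_t values with a <wctype.h> function. The helper name count_alpha_wide is hypothetical and is not taken from the PostgreSQL source; it only illustrates the conversion step, not how regc_locale.c would actually be changed.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not PostgreSQL code): count the alphabetic
 * characters in a NUL-terminated multibyte string, as judged by the
 * current LC_CTYPE setting.  Returns -1 if the string is not valid in
 * the locale's encoding, or if memory allocation fails.
 */
long
count_alpha_wide(const char *mbstr)
{
    size_t   nwide;
    size_t   i;
    wchar_t *wide;
    long     count = 0;

    /* With a NULL destination, POSIX mbstowcs reports the length needed. */
    nwide = mbstowcs(NULL, mbstr, 0);
    if (nwide == (size_t) -1)
        return -1;

    wide = malloc((nwide + 1) * sizeof(wchar_t));
    if (wide == NULL)
        return -1;

    /* Convert for real, leaving room for the terminating L'\0'. */
    if (mbstowcs(wide, mbstr, nwide + 1) == (size_t) -1)
    {
        free(wide);
        return -1;
    }

    /* These wchar_t values are what the <wctype.h> functions expect. */
    for (i = 0; i < nwide; i++)
    {
        if (iswalpha((wint_t) wide[i]))
            count++;
    }

    free(wide);
    return count;
}
```

The result depends entirely on the process locale: the caller is assumed to have already called setlocale(LC_CTYPE, ...) with a locale whose encoding matches the string, which is the same precondition stated above.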
--
Andrew.