From: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
---|---|
To: | masm(at)fciencias(dot)unam(dot)mx |
Cc: | peter_e(at)gmx(dot)net, pgman(at)candle(dot)pha(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: regexp character class locale awareness patch |
Date: | 2002-04-16 02:42:47 |
Message-ID: | 20020416114247T.t-ishii@sra.co.jp |
Lists: | pgsql-hackers |
> According to POSIX regex(7), the standard character classes are:
>
> alnum digit punct
> alpha graph space
> blank lower upper
> cntrl print xdigit
>
> Many of those classes differ between locales, and currently
> they all behave as if the locale were C. Many of those tests have
> multibyte issues; however, with the patch postgres will work for
> one-byte encodings, which is better than nothing. If someone
> (Tatsuo?) gives some advice I will work on the multibyte version.
I don't think character classes are applicable to most multibyte
encodings. Maybe the only exception is Unicode?
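For the one-byte case, the locale dependence can be checked with a
small sketch like the one below. It is only an illustration, not code
from the patch: count_class_members() is a hypothetical helper that
counts how many byte values 0..255 a <ctype.h> predicate accepts under
the current LC_CTYPE setting.

```c
#include <ctype.h>
#include <locale.h>

/*
 * Hypothetical helper (not part of the patch): count the single-byte
 * values 0..255 accepted by a <ctype.h> predicate such as isalpha()
 * under whatever LC_CTYPE is currently in effect.
 */
int
count_class_members(int (*predicate) (int))
{
	int			c;
	int			n = 0;

	for (c = 0; c <= 255; c++)
	{
		if (predicate(c))
			n++;
	}
	return n;
}
```

After setlocale(LC_CTYPE, "C") the count for isalpha is the 52 ASCII
letters. With a one-byte locale such as an ISO 8859-1 one, if it is
installed, the count may be larger, which is the difference the patch
is meant to expose.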
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> >
> > Basically, you manually preprocess the patch to include the
> > USE_LOCALE branch and remove the not USE_LOCALE branch.
>
> Yeah, that should work. You may also remove include/regex/cclass.h
> since it will not be used any more.
But I don't like that the cclass_init() routine runs every time
reg_comp is called. In my understanding the result of cclass_init() is
always the same. What about running cclass_init() in postmaster, not
postgres? Or even better, at initdb time?
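As a rough sketch of the idea of not repeating the work, assuming the
table depends only on the locale: the code below builds a class table
once per process and reuses it afterwards. This is a simpler variant
than moving the work to postmaster or initdb, and all names in it
(build_alpha_table, get_alpha_table, alpha_table_build_count) are
hypothetical, not the patch's actual cclass_init() interface.

```c
#include <ctype.h>

#define CLASS_TABLE_SIZE 256

/* Hypothetical cached table: 1 if the byte value is alphabetic. */
static unsigned char alpha_table[CLASS_TABLE_SIZE];
static int	table_ready = 0;
static int	build_count = 0;	/* only here to show the table is built once */

/* Stand-in for the expensive initialization step. */
static void
build_alpha_table(void)
{
	int			c;

	for (c = 0; c < CLASS_TABLE_SIZE; c++)
		alpha_table[c] = (isalpha(c) != 0);
	build_count++;
}

/* Return the table, building it only on the first call. */
const unsigned char *
get_alpha_table(void)
{
	if (!table_ready)
	{
		build_alpha_table();
		table_ready = 1;
	}
	return alpha_table;
}

/* Report how many times the table has been built in this process. */
int
alpha_table_build_count(void)
{
	return build_count;
}
```

With this shape, every regex compilation would call get_alpha_table()
and pay the initialization cost only once per backend. Building the
table in postmaster before forking would remove even that cost.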
--
Tatsuo Ishii