From: | Hannu Krosing <hannu(at)tm(dot)ee> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Jon Jensen <jon(at)endpoint(dot)com>, Neil Conway <neilc(at)samurai(dot)com>, wade <wade(at)wavefire(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POSIX regex performance bug in 7.3 Vs. 7.2 |
Date: | 2003-02-04 21:03:32 |
Message-ID: | 1044392612.19416.57.camel@huli |
Lists: | pgsql-hackers |
On Tue, 2003-02-04 at 18:21, Tom Lane wrote:
> 4. pcre looks like it's probably *not* as well suited to a multibyte
> environment. In particular, I doubt that its UTF8 compile option was
> even turned on for the performance comparison Neil cited --- and the man
> page only promises "experimental, incomplete support for UTF-8 encoded
> strings". The Tcl code by contrast is used only in a multibyte
> environment, so that's the supported, optimized path. It doesn't even
> assume null-terminated strings (yay).
If we are going into the code-lifting business, we should also consider
Python's sre (a modified pcre that works both on 8-bit strings and on Python's
unicode strings, with either 16- or 32-bit chars depending on compile options).
It has no specific support for "raw" UTF-8 or other variable-width
encodings.
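
To illustrate the last point, here is a minimal sketch using the `re` module of present-day Python 3, which is the descendant of sre. That is an assumption on my part about a much later version than the one under discussion, and the sample string is made up. It only shows that an 8-bit pattern treats raw UTF-8 as plain bytes, so `.` matches one byte rather than one character:

```python
import re

text = "é"                  # one Unicode code point (U+00E9)
raw = text.encode("utf-8")  # the same character as two raw UTF-8 bytes

# Unicode pattern on a Unicode string: '.' consumes the whole character.
unicode_match = re.match(r".", text).group()

# 8-bit pattern on raw UTF-8 bytes: '.' consumes a single byte,
# which cuts the encoded character in half.
byte_match = re.match(rb".", raw).group()

print(unicode_match == text)  # the full character was matched
print(len(raw), byte_match)   # two bytes in, only the first one matched
```

So to match per character, the text has to be decoded to a fixed-width unicode string before it reaches the engine.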
--
Hannu Krosing <hannu(at)tm(dot)ee>