From: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | "Andrew - Supernews" <andrew(at)supernews(dot)net>, pgsql-patches(at)postgresql(dot)org |
Subject: | UTF8MatchText |
Date: | 2007-04-02 04:56:04 |
Message-ID: | 20070402133445.DDF8.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
Lists: | pgsql-hackers pgsql-patches |
"Andrew - Supernews" <andrew(at)supernews(dot)net> wrote:
> ITAGAKI> I think all "safe ASCII-supersets" encodings are comparable
> ITAGAKI> by bytes, not only UTF-8.
>
> This is false, particularly for EUC.
Hmm, I see. I have updated the optimization so that it is used only in the UTF8 case.
I also added some inlining hints that are useful on my machine (Pentium 4).
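
To illustrate why byte-wise comparison is safe for UTF8 but not for EUC, here is a minimal sketch. It is not the code from the attached patch; all function names are hypothetical, and the EUC_JP length rule is simplified. In EUC_JP both bytes of a two-byte character fall in the same high range, so a byte-wise search can match across a character boundary. In UTF-8 a continuation byte can never be mistaken for a lead byte, so byte-wise and character-wise searches agree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: byte length of a character, given its lead byte. */
typedef size_t (*charlen_fn)(unsigned char lead);

static size_t utf8_charlen(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;                       /* invalid lead byte: step one byte */
}

/* Simplified EUC_JP rule (assumption): ASCII 1, SS3 (0x8F) 3, otherwise 2. */
static size_t eucjp_charlen(unsigned char c)
{
    if (c < 0x80) return 1;
    if (c == 0x8F) return 3;
    return 2;
}

/* Try every byte offset; returns the first matching offset or -1. */
static long find_bytewise(const unsigned char *text, size_t tlen,
                          const unsigned char *pat, size_t plen)
{
    for (size_t i = 0; i + plen <= tlen; i++)
        if (memcmp(text + i, pat, plen) == 0)
            return (long) i;
    return -1;
}

/* Try only offsets that start a character; returns offset or -1. */
static long find_charwise(const unsigned char *text, size_t tlen,
                          const unsigned char *pat, size_t plen,
                          charlen_fn charlen)
{
    size_t i = 0;

    while (i + plen <= tlen)
    {
        if (memcmp(text + i, pat, plen) == 0)
            return (long) i;
        i += charlen(text[i]);      /* advance by one whole character */
    }
    return -1;
}

void like_sketch_selfcheck(void)
{
    /* EUC_JP: two characters A4A2 and A4A4; the pattern A2A4 is itself a
     * valid two-byte code, but it does not occur as a character here. */
    const unsigned char euc_text[] = {0xA4, 0xA2, 0xA4, 0xA4};
    const unsigned char euc_pat[]  = {0xA2, 0xA4};
    /* UTF-8: two three-byte characters; the pattern is the second one. */
    const unsigned char u8_text[]  = {0xE3, 0x81, 0x82, 0xE3, 0x81, 0x84};
    const unsigned char u8_pat[]   = {0xE3, 0x81, 0x84};

    /* Byte-wise search reports a false hit inside the EUC_JP text. */
    assert(find_bytewise(euc_text, 4, euc_pat, 2) == 1);
    assert(find_charwise(euc_text, 4, euc_pat, 2, eucjp_charlen) == -1);
    /* For UTF-8 both searches find the same offset. */
    assert(find_bytewise(u8_text, 6, u8_pat, 3) == 3);
    assert(find_charwise(u8_text, 6, u8_pat, 3, utf8_charlen) == 3);
}
```

Calling like_sketch_selfcheck() from a test program exercises both cases.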
1000 runs of LIKE '%foo%' on 10000-row tables [ms]
 encoding  | HEAD  |  P1   |  P2   |  P3
-----------+-------+-------+-------+-------
 SQL_ASCII |  7094 |  7120 |  7063 |  7031
 LATIN1    |  7083 |  7130 |  7057 |  7031
 UTF8      | 17974 | 10859 | 10839 |  9682
 EUC_JP    | 17032 | 17557 | 17599 | 15240
- P1: UTF8MatchText()
- P2: P1 + __inline__ GenericMatchText()
- P3: P2 + __inline__ wchareq()
(The attached patch is P3.)
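
For readers without the patch at hand, the kind of inlining hint meant in P3 can be sketched roughly as follows. This is an assumption-laden illustration, not the patch itself: sketch_mblen() and sketch_chareq() are hypothetical names, the length lookup handles UTF-8 only, and standard C `static inline` is used in place of the compiler-specific `__inline__` keyword. The idea is that a small per-character comparison called from the inner matching loop may be cheaper when the compiler can expand it in place.

```c
#include <assert.h>

/* Hypothetical stand-in for a multibyte length lookup (UTF-8 only). */
static inline int sketch_mblen(const char *s)
{
    unsigned char c = (unsigned char) *s;

    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;                       /* invalid lead byte: treat as one byte */
}

/* Returns 1 if the characters starting at p1 and p2 are equal, else 0. */
static inline int sketch_chareq(const char *p1, const char *p2)
{
    int len;

    /* Fast path: most mismatches are decided by the first byte. */
    if (*p1 != *p2)
        return 0;

    /* Same lead byte implies the same length in this sketch. */
    len = sketch_mblen(p1);
    while (len--)
    {
        if (*p1 != *p2)
            return 0;
        if (*p1 == '\0')
            break;                  /* truncated sequence: stop at the NUL */
        p1++;
        p2++;
    }
    return 1;
}

void chareq_sketch_selfcheck(void)
{
    assert(sketch_chareq("a", "a") == 1);
    assert(sketch_chareq("a", "b") == 0);
    /* Two three-byte characters that differ only in the last byte. */
    assert(sketch_chareq("\xE3\x81\x82", "\xE3\x81\x84") == 0);
    assert(sketch_chareq("\xE3\x81\x82", "\xE3\x81\x82") == 1);
}
```

Whether such a hint actually helps depends on the compiler and CPU; the numbers above are from one Pentium 4 machine only.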
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
utf8matchtext.patch | application/octet-stream | 17.4 KB |