From: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | pgsql-patches(at)postgresql(dot)org, andrew+nonews(at)supernews(dot)com |
Subject: | Multibyte LIKE optimization |
Date: | 2007-03-30 08:40:08 |
Message-ID: | 20070330142456.77F9.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
Lists: | pgsql-hackers pgsql-patches |
Andrew - Supernews <andrew+nonews(at)supernews(dot)com> wrote:
> Actually, I think your proposal is fundamentally correct, merely incomplete.
Yeah, I fixed the patch to handle '_' correctly.
> Doing octet-based rather than character-based matching of strings is a
> _design goal_ of UTF8.
I think every "safe ASCII-superset" encoding is comparable by bytes,
not only UTF-8: all of their multibyte characters consist of bytes larger
than 127. I updated the patch on this presupposition. It normally uses
octet-based matching and switches to character-based matching at '_'.
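To make the idea concrete, here is a minimal sketch of such a matcher. It is not the code from the attached patch. It assumes UTF-8 only, where continuation bytes (10xxxxxx) can be told apart from lead bytes, and it makes no claim about other encodings. The names next_char, like_match and like_utf8 are hypothetical, and escape characters and performance tuning are left out. Literals are compared byte by byte, and only '_' and the scan after '%' step by whole characters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: advance past one UTF-8 character.
 * Steps one byte, then skips continuation bytes (10xxxxxx).
 * Caller must ensure *len > 0. */
static void
next_char(const unsigned char **s, size_t *len)
{
    do
    {
        (*s)++;
        (*len)--;
    } while (*len > 0 && (**s & 0xC0) == 0x80);
}

/* Sketch of LIKE matching with '%' and '_' only (no escape handling). */
static bool
like_match(const unsigned char *t, size_t tlen,
           const unsigned char *p, size_t plen)
{
    while (plen > 0)
    {
        if (*p == '%')
        {
            p++;
            plen--;
            if (plen == 0)
                return true;    /* trailing '%' matches any remainder */

            /* Try the rest of the pattern at each character boundary. */
            for (;;)
            {
                if (like_match(t, tlen, p, plen))
                    return true;
                if (tlen == 0)
                    return false;
                next_char(&t, &tlen);
            }
        }

        if (tlen == 0)
            return false;

        if (*p == '_')
            next_char(&t, &tlen);   /* character-based: one whole character */
        else if (*p == *t)
        {
            t++;                    /* octet-based: compare a single byte */
            tlen--;
        }
        else
            return false;

        p++;
        plen--;
    }
    return tlen == 0;
}

/* Convenience wrapper for NUL-terminated strings. */
static bool
like_utf8(const char *text, const char *pattern)
{
    return like_match((const unsigned char *) text, strlen(text),
                      (const unsigned char *) pattern, strlen(pattern));
}
```

For example, like_utf8("foobar", "%oba%") is true, and a single three-byte character matches "_" but not "%__", because '_' always consumes a whole character rather than one byte.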
There was a performance win of about 30% or more in selections using LIKE '%foo%'
on multibyte encodings. The percentages show patched time relative to HEAD.
 encoding  |  HEAD   | patched
-----------+---------+-----------------
 SQL_ASCII |  7094ms |  7062ms
 LATIN1    |  7083ms |  7078ms
 UTF8      | 17974ms | 11635ms (64.7%)
 EUC_JP    | 17032ms | 12109ms (71.1%)
If this patch is acceptable, please drop the JOHAB encoding from the server
encodings before it is applied, because trailing bytes in JOHAB can be less than 128.
http://archives.postgresql.org/pgsql-hackers/2007-03/msg01475.php
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
mbtextmatch.patch | application/octet-stream | 6.7 KB |