From: | Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org |
Subject: | Re: UTF8MatchText |
Date: | 2007-05-20 07:44:54 |
Message-ID: | 464FFC76.9060308@zigo.dhs.org |
Lists: | pgsql-hackers pgsql-patches |
Tom Lane wrote:
> You could imagine trying to do
> % a byte at a time (and indeed that's what I'd been thinking it did)
> but that gets you out of sync which breaks the _ case.
This is only a problem when the pattern looks like '%_'. We could detect
that case and scan byte by byte whenever it does not apply. Right now we
check (*p == '\\') || (*p == '_') in each iteration while scanning over
characters for '%'; we could do that check once and have separate loops
for the two cases.
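
To make the two-loop idea concrete, here is a minimal standalone sketch. It is not the code from the patch or from like.c: the names utf8_len, like_match and like are made up for illustration, the input is assumed to be valid UTF-8, and a trailing backslash simply fails to match instead of raising an error. The test on the pattern byte after '%' is done once, and each case gets its own loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 character whose lead byte is c (assumes valid input). */
static size_t utf8_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Hypothetical LIKE matcher: '%' = any run of characters, '_' = one character. */
static bool like_match(const char *t, size_t tlen, const char *p, size_t plen)
{
    while (tlen > 0 && plen > 0)
    {
        if (*p == '%')
        {
            /* Collapse consecutive '%'; a trailing '%' matches everything. */
            while (plen > 0 && *p == '%') { p++; plen--; }
            if (plen == 0)
                return true;

            /* Decide once which loop to use, not in every iteration. */
            if (*p == '_' || *p == '\\')
            {
                /* Slow loop: advance whole characters so '_' stays in sync. */
                while (tlen > 0)
                {
                    size_t n = utf8_len((unsigned char) *t);

                    if (like_match(t, tlen, p, plen))
                        return true;
                    if (n > tlen) n = tlen;
                    t += n; tlen -= n;
                }
            }
            else
            {
                /*
                 * Fast loop: the next pattern byte is the lead byte of a
                 * literal, which can never equal a continuation byte, so
                 * a byte-wise scan only recurses on character boundaries.
                 */
                while (tlen > 0)
                {
                    if (*t == *p && like_match(t, tlen, p, plen))
                        return true;
                    t++; tlen--;
                }
            }
            return false;   /* text exhausted, pattern still needs input */
        }
        else if (*p == '_')
        {
            /* '_' consumes one whole character, however many bytes it has. */
            size_t n = utf8_len((unsigned char) *t);

            if (n > tlen) n = tlen;
            t += n; tlen -= n;
            p++; plen--;
        }
        else
        {
            if (*p == '\\')
            {
                /* Escape: the following pattern byte is taken literally. */
                p++; plen--;
                if (plen == 0)
                    return false;
            }
            if (*p != *t)
                return false;
            p++; plen--;
            t++; tlen--;
        }
    }
    if (tlen > 0)
        return false;
    /* Text is exhausted: only trailing '%' may remain in the pattern. */
    while (plen > 0 && *p == '%') { p++; plen--; }
    return plen == 0;
}

/* Convenience wrapper for NUL-terminated strings (illustration only). */
static bool like(const char *text, const char *pattern)
{
    return like_match(text, strlen(text), pattern, strlen(pattern));
}
```

With this sketch, like("abc", "a%c") takes the fast loop, while like("\xe2\x82\xac", "%__") takes the slow loop and returns false, since the text is a single three-byte character. That second pattern is the kind of case where a byte-wise scan could start '_' on a continuation byte and lose sync.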
Other than this part, which I think can be optimized, I don't see anything
wrong with the idea behind the patch. Making the '%' case fast might be
an important optimization for a lot of use cases. It's not uncommon for
'%' to match a bigger part of the string than the rest of the pattern.
It's easy to make a mistake when one is used to thinking about simple
fixed-size characters like ASCII. Strange that this simple topic can be
so difficult to think about... :-)
/Dennis