| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
| Cc: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: like/ilike improvements |
| Date: | 2007-05-22 17:01:14 |
| Message-ID: | 14707.1179853274@sss.pgh.pa.us |
| Lists: | pgsql-hackers pgsql-patches |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> I thought we'd determined that advancing bytewise for "%" was also
>> risky, in two cases:
>>
>> 1. Multibyte character set that is not UTF8 (more specifically, does not
>> have a guarantee that first bytes and not-first bytes are distinct)
> I thought we disposed of the idea that there was a problem with charsets
> that didn't do first byte special.
We disposed of that in connection with a version of the patch that had
"%" advancing in NextChar units, so that comparison of ordinary
characters was always safely char-aligned. Consider 2-byte characters
represented as {AB} etc:

    DATA     x{AB}{CD}y
    PATTERN  %{BC}%

If "%" advances by bytes then this will find a spurious match: the
pattern bytes BC line up with the trailing byte of {AB} followed by the
leading byte of {CD}. The only thing that prevents it is if "B" can't be
both a leading and a trailing byte of validly-encoded MB characters.
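
The example above can be reproduced with a minimal sketch. This is Python rather than the backend's C, and it is not the patch under discussion. It assumes a hypothetical toy encoding in which any byte >= 0x80 starts a 2-byte character whose second byte is also >= 0x80, so the same byte value can be both a leading and a trailing byte. The names `char_len`, `contains_bytewise`, and `contains_char_aligned` are made up for illustration, and only the `%literal%` case is modelled.

```python
# Toy encoding (an assumption for illustration only): bytes < 0x80 are
# single-byte characters; a byte >= 0x80 starts a 2-byte character whose
# second byte is also >= 0x80. Lead and trail bytes are NOT distinct.

def char_len(first_byte: int) -> int:
    """Length in bytes of the character that starts with first_byte."""
    return 2 if first_byte >= 0x80 else 1


def contains_bytewise(data: bytes, literal: bytes) -> bool:
    """Model of %literal% where "%" advances one byte at a time."""
    for start in range(len(data) - len(literal) + 1):
        if data[start:start + len(literal)] == literal:
            return True
    return False


def contains_char_aligned(data: bytes, literal: bytes) -> bool:
    """Model of %literal% where "%" advances one whole character at a time."""
    pos = 0
    while pos + len(literal) <= len(data):
        if data[pos:pos + len(literal)] == literal:
            return True
        pos += char_len(data[pos])  # skip to the next character boundary
    return False


# A, B, C, D are stand-in byte values, all >= 0x80.
A, B, C, D = 0xA1, 0xA2, 0xA3, 0xA4
DATA = b"x" + bytes([A, B]) + bytes([C, D]) + b"y"   # x{AB}{CD}y
LITERAL_BC = bytes([B, C])                            # the {BC} in %{BC}%
LITERAL_CD = bytes([C, D])                            # a character really present

print(contains_bytewise(DATA, LITERAL_BC))       # True  (spurious match)
print(contains_char_aligned(DATA, LITERAL_BC))   # False (correct)
print(contains_char_aligned(DATA, LITERAL_CD))   # True  (genuine match)
```

In an encoding such as UTF8, where leading and trailing bytes come from disjoint ranges, the bytewise scan could not produce the first result, because a valid pattern character never begins with a trailing byte.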
regards, tom lane