From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Guillaume Smet <guillaume(dot)smet(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: like/ilike improvements |
Date: | 2007-09-19 18:39:29 |
Message-ID: | 46F16CE1.7040409@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
Guillaume Smet wrote:
> Andrew, All,
>
>
>> On 5/22/07, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>
>>> But before I commit this I'd appreciate seeing some more testing, both
>>> for correctness and performance.
>>>
>
> I finally found some time to test this patch on our data. As our
> production database is still using 8.1, I made my tests with 8.1.10
> and 8.3devel. As I had very weird results, I tested also 8.2.5.
>
> The patch seems to work as expected in my locale. I didn't notice
> problems during the tests I made except for the performance problem I
> describe below.
>
> The box is a recent dual core box using CentOS 5. It's a test box
> installed specifically to test PostgreSQL 8.3. Every version is
> compiled with the same compiler. Locale is fr_FR.UTF-8 and database is
> UTF-8 too.
> The table used to make the tests fits entirely in RAM.
>
> I tested a simple ILIKE query on our data with 8.3devel and it was far
> slower than with 8.1.10 (2 times slower). It was obviously not the
> expected result as it should have been faster considering your work.
> So I decided to test also with 8.2.5 and it seems a performance
> regression was introduced in 8.2 (and not in 8.3 which is in fact a
> bit faster than 8.2).
>
> I saw this item in 8.2 release notes:
> Allow ILIKE to work for multi-byte encodings (Tom)
> Internally, ILIKE now calls lower() and then uses LIKE.
> Locale-specific regular expression patterns still do not work in these
> encodings.
>
> Could it be responsible of such a slow down?
>
> I attached the results of my tests. If anyone needs more information,
> I'll be glad to provide them.
>
>
>
Ugh.
It's at least good to see that the LIKE case has some useful speedup in
8.3.
Can you run the same set of tests in a single byte encoding like latin1?
We might have to look at doing on-demand lowering, but in a case like
yours it looks like we'd still end up lowering almost every character
anyway, so I'm not quite sure what to do. Note that the 8.2 change was a
bug fix, so we can't just revert it. Maybe we need to look closely at
the efficiency of lower().
cheers
andrew
From | Date | Subject | |
---|---|---|---|
Next Message | Gregory Stark | 2007-09-19 19:02:09 | Re: curious regression failures |
Previous Message | Pedro Belmino | 2007-09-19 18:34:32 | Debugger |
From | Date | Subject | |
---|---|---|---|
Next Message | Gregory Stark | 2007-09-19 19:02:09 | Re: curious regression failures |
Previous Message | Andrew Dunstan | 2007-09-19 18:14:40 | Re: curious regression failures |