From: | wade <wade(at)wavefire(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: POSIX regex performance bug in 7.3 Vs. 7.2 |
Date: | 2003-02-04 16:24:47 |
Message-ID: | 3.0.32.20030204082447.020a2aa0@mail.wavefire.com |
Lists: | pgsql-hackers |
OK,
I redid my trials with the same data set on 7.2.3 --with-multibyte and I
get the same brutal performance hit, so it is definitely a
multibyte-specific problem.
WRT the distribution of the data in the table, I used the following:
All g-words in /usr/share/dict, with different transformations applied:
no transformation
initial caps
word || row_id
etc...
There are only about 1000 words that appear more than once (2 or 3 times)
in 27k rows.
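For anyone wanting to reproduce a similar distribution, here is a minimal sketch of how such a table could be populated. It is not the script actually used: the function names, the round-robin assignment of variants, and the tiny sample list are all assumptions for illustration, and the variants hidden behind "etc..." are left out because they are not specified.

```python
from collections import Counter


def build_rows(words):
    """Return (row_id, text) pairs built from the words that start with 'g'.

    Assumption: variants are assigned round-robin. The original message
    does not say how they were distributed over the rows.
    """
    variants = [
        lambda w, i: w,               # word left unchanged
        lambda w, i: w.capitalize(),  # initial capital
        lambda w, i: f"{w}{i}",       # word || row_id
    ]
    g_words = [w for w in words if w.lower().startswith("g")]
    rows = []
    for row_id, word in enumerate(g_words, start=1):
        variant = variants[(row_id - 1) % len(variants)]
        rows.append((row_id, variant(word, row_id)))
    return rows


def duplicate_count(rows):
    """Count distinct texts that occur in more than one row."""
    counts = Counter(text for _, text in rows)
    return sum(1 for n in counts.values() if n > 1)


# Hypothetical sample; a real run would read the system dictionary file.
sample = ["gale", "apple", "game", "gap", "gale"]
rows = build_rows(sample)
```

Feeding the dictionary file's lines into `build_rows` and checking `duplicate_count` against the roughly 1000 repeated words mentioned above would show whether a generated table is in the same ballpark.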
-Wade Klaver
At 11:08 PM 2/3/03 -0500, Tom Lane wrote:
>Next question: may I guess that you weren't using MULTIBYTE in 7.2?
>
>After still more digging, I'm coming round to the opinion that the
>problem is that MULTIBYTE is forced on in 7.3, and this imposes a
>factor-of-256 overhead in a bunch of the operations in regcomp.c.
>In particular, compiling a case-independent regex is now hugely
>more expensive than it used to be.
>
>The parties who wanted to force MULTIBYTE on promised that there
>would be no such penalties :-(
>
> regards, tom lane
>
>