| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | "David Rowley" <dgrowley(at)gmail(dot)com> |
| Cc: | "'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "'Peter Eisentraut'" <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: [PATCHES] TODO item: Implement Boyer-Moore searching (First time hacker) |
| Date: | 2008-09-07 04:26:47 |
| Message-ID: | 6183.1220761607@sss.pgh.pa.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers pgsql-patches |
"David Rowley" <dgrowley(at)gmail(dot)com> writes:
> I've made the discussed changes. Also updated the benchmark results.
> http://www.unixbeast.com/~fat/8.3_test_v1.3.xls
Applied with revisions; mostly cosmetic except for one point. I
realized after studying the code a bit more that B-M cannot possibly win
for a single-character pattern (needle), since the skip distance must
always be 1 in that case. The fact that it seemed to keep up at that
length has to be because the original coding included a strncmp call
inside the innermost loop, which likely prevents the compiler from
optimizing that loop really tightly. But the strncmp wasn't doing
anything anyway for the case of pattern length = 1. So what I committed
special-cases pattern length 1 to be a naive search with a *very* tight
inner loop. I think it's worth troubling over this case because a
common usage is split_to_array and suchlike with single-character
delimiters.
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2008-09-07 07:39:39 | Re: Noisy CVS updates |
| Previous Message | Alex Hunsaker | 2008-09-07 02:24:33 | Re: hash index improving v3 |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jaime Casanova | 2008-09-07 05:46:40 | Re: [PgFoundry] Unsigned Data Types [1 of 2] |
| Previous Message | Alex Hunsaker | 2008-09-07 02:24:33 | Re: hash index improving v3 |