Quick Links

Re: strpos() && KMP

From:	Pavel Ajtkulov <ajtkulov(at)acm(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-patches(at)postgresql(dot)org
Subject:	Re: strpos() && KMP
Date:	2007-08-10 20:27:49
Message-ID:	16410532421.20070811012749@acm.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-patches

Tom Lane writes:

>> hash table?

> I'd think the cost of hashing would render it impractical. Most of the
> papers I've seen on this topic worry about getting single instructions
> out of the search loop --- a hash lookup will cost lots more than that.
> Moreover, you'd lose the guarantee of not-worse-than-linear time,
> because hash lookup can be pathologically bad if you get a lot of hash
> collisions.

compute max_wchar, min_wchar. If (d = max_wchar - min_wchar) < k (for
example, k = 1000), then we use index table (wchar -> wchar -
min_wchar). Else we use hash table. Number of collisions would be a
few (because hash table needs for pattern characters only. Characters
located serially, hash function = whchar % const).

>> The main difficulty with BM is verification and understanding "good
>> suffix shift" (the second part of BM) (I don't understand it entirely).

> Yeah, there seem to be a bunch of variants of BM (many of them not
> guaranteed linear, which I'm sure we don't want) and the earliest
> papers had bugs. But a good implementation would be a lot easier
> sell because it would show benefits for a much wider set of use-cases
> than KMP.

Is there requirement for some string mathching algorithms/data
structure(suffix array/tree) in PG? or "We've
had no complaints about the speed of those functions".

----
Ajtkulov Pavel
ajtkulov(at)acm(dot)org

In response to

Re: strpos() && KMP at 2007-08-09 02:24:20 from Tom Lane

Responses

Re: strpos() && KMP at 2007-08-11 05:15:12 from Tom Lane

Browse pgsql-patches by date

	From	Date	Subject
Next Message	Andrew Dunstan	2007-08-11 02:21:12	final CSVlog patch
Previous Message	Decibel!	2007-08-10 19:47:55	Re: Reduce the size of PageFreeSpaceInfo on 64bit platform