Re: strpos() && KMP

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Pavel Ajtkulov <ajtkulov(at)acm(dot)org>
Cc:	pgsql-patches(at)postgresql(dot)org
Subject:	Re: strpos() && KMP
Date:	2007-08-11 05:15:12
Message-ID:	22661.1186809312@sss.pgh.pa.us
Lists:	pgsql-patches

Pavel Ajtkulov <ajtkulov(at)acm(dot)org> writes:
> Tom Lane writes:
>> Moreover, you'd lose the guarantee of not-worse-than-linear time,
>> because hash lookup can be pathologically bad if you get a lot of hash
>> collisions.

> Compute max_wchar and min_wchar. If d = max_wchar - min_wchar is less
> than k (for example, k = 1000), then we use an index table (wchar ->
> wchar - min_wchar). Otherwise we use a hash table. There would be only a
> few collisions, because the hash table is needed only for the pattern's
> characters.
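
The quoted proposal can be sketched roughly as follows. This is only an illustration, not code from the patch: the function name build_char_table is hypothetical, and the message does not say what the table stores, so the sketch assumes it maps each pattern character to the index of its last occurrence. The threshold k = 1000 is the example value from the quote.

```python
def build_char_table(pattern, k=1000):
    """Return a lookup function: char -> last index in pattern, or -1.

    Assumption: the table holds last-occurrence indexes; the original
    message does not specify its contents.
    """
    if not pattern:
        return lambda ch: -1

    codes = [ord(c) for c in pattern]
    lo, hi = min(codes), max(codes)

    if hi - lo < k:
        # Narrow code point range: dense index table, offset by min_wchar.
        table = [-1] * (hi - lo + 1)
        for i, code in enumerate(codes):
            table[code - lo] = i

        def lookup(ch):
            off = ord(ch) - lo
            return table[off] if 0 <= off < len(table) else -1

        return lookup

    # Wide range: hash table keyed by the pattern's characters only.
    sparse = {}
    for i, c in enumerate(pattern):
        sparse[c] = i
    return lambda ch: sparse.get(ch, -1)
```

The dense branch gives a lookup cost that does not depend on the input, while the dict branch is fast on average but depends on how the keys hash.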

I think you missed my point: there's a significant difference between
"guaranteed good performance" and "probabilistically good performance".
Even when the probably-good algorithm wins for typical cases, there's a
strong argument to be made for guarantees. The problem you set out to
solve really is that an algorithm that's all right in everyday cases
will suck in certain uncommon cases --- so why do you want to fix it
by just moving around the cases in which it fails to do well?
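
To make the distinction concrete, here is a small sketch that counts character comparisons for a straightforward scan and for KMP. It is not PostgreSQL's strpos() implementation; the function names and the comparison counters are illustrative assumptions. The point is only that KMP's search phase never needs more than 2 * len(text) comparisons, whatever the input, while the plain scan can need on the order of len(text) * len(pattern).

```python
def naive_find(text, pattern):
    """Return (index, comparisons) using a straightforward scan."""
    comparisons = 0
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1
            if text[start + j] != pattern[j]:
                break
            j += 1
        if j == m:
            return start, comparisons
    return -1, comparisons


def kmp_find(text, pattern):
    """Return (index, comparisons) using Knuth-Morris-Pratt.

    Only comparisons made while scanning the text are counted.
    """
    m = len(pattern)
    if m == 0:
        return 0, 0

    # failure[i]: length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it.
    failure = [0] * m
    k = 0
    for i in range(1, m):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k

    comparisons = 0
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while True:
            comparisons += 1
            if ch == pattern[k]:
                k += 1
                break
            if k == 0:
                break
            k = failure[k - 1]  # fall back without re-reading the text
        if k == m:
            return i - m + 1, comparisons
    return -1, comparisons
```

An input such as a long run of "a" searched for a run of "a" followed by "b" is the kind of uncommon case where the plain scan degrades and KMP does not.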

regards, tom lane

In response to

Re: strpos() && KMP at 2007-08-10 20:27:49 from Pavel Ajtkulov
