From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function)
Date: 2010-10-13 15:53:35
Message-ID: AANLkTik5AkOahj3GL6ssZCOaRWgSmcc=Pp3kj4vdnZyb@mail.gmail.com
Lists: pgsql-hackers
> No doubt, but the actual function runtime is only one component of the
> cost of applying it to a lot of dictionary entries --- I would think
> that the table read costs are the larger component anyway.

The data domain need not be just a dictionary; it can also be something like
article titles, URLs, and so on. On such relatively long strings (about 100
characters and more) this component becomes significant (especially when most
of the table is already in cache). In that case the search for near strings
can be accelerated by more than 10 times. I think this use case justifies
having a separate levenshtein_less_equal function.
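
To make the gain concrete: a bounded variant can abandon the
dynamic-programming matrix as soon as no cell in the current row can still
lead to a distance within the limit. The following is only a minimal
standalone sketch of that idea (the function name and the simple row-minimum
early exit are my own illustration, not the patch under discussion, which
bounds the computed band more aggressively):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of a bounded Levenshtein distance.  Returns the distance if it is
 * <= max_d, otherwise max_d + 1.  Illustrates the early-exit principle only;
 * it is not the fuzzystrmatch implementation.
 */
static int
levenshtein_less_equal_sketch(const char *s, const char *t, int max_d)
{
	int		m = (int) strlen(s);
	int		n = (int) strlen(t);
	int	   *prev = malloc((n + 1) * sizeof(int));
	int	   *curr = malloc((n + 1) * sizeof(int));
	int		i, j, result;

	/* distance from the empty prefix of s to each prefix of t */
	for (j = 0; j <= n; j++)
		prev[j] = j;

	for (i = 1; i <= m; i++)
	{
		int		row_min;

		curr[0] = i;
		row_min = i;
		for (j = 1; j <= n; j++)
		{
			int		cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
			int		best = prev[j] + 1;			/* deletion */

			if (curr[j - 1] + 1 < best)
				best = curr[j - 1] + 1;			/* insertion */
			if (prev[j - 1] + cost < best)
				best = prev[j - 1] + cost;		/* substitution */
			curr[j] = best;
			if (best < row_min)
				row_min = best;
		}

		/*
		 * Every path to the final cell passes through this row and edit
		 * costs are non-negative, so the final distance is at least row_min.
		 * Once row_min > max_d the answer cannot come back under the limit.
		 */
		if (row_min > max_d)
		{
			free(prev);
			free(curr);
			return max_d + 1;
		}

		{
			int	   *tmp = prev;		/* reuse the two rows */

			prev = curr;
			curr = tmp;
		}
	}

	result = prev[n];
	free(prev);
	free(curr);
	return (result > max_d) ? max_d + 1 : result;
}

int
main(void)
{
	/* bails out after a few rows: the strings are nowhere near distance 2 */
	printf("%d\n", levenshtein_less_equal_sketch("postgresql", "levenshtein", 2));
	return 0;
}

With an exit like this, a filter such as
levenshtein_less_equal(title, 'query string', 2) <= 2 only pays the full
O(m*n) cost for the handful of rows that are actually close, which is where
the speedup on ~100-character strings comes from.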
----
With best regards,
Alexander Korotkov.