From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function)
Date: 2010-10-13 15:53:35
Message-ID: AANLkTik5AkOahj3GL6ssZCOaRWgSmcc=Pp3kj4vdnZyb@mail.gmail.com
Lists: pgsql-hackers
> No doubt, but the actual function runtime is only one component of the
> cost of applying it to a lot of dictionary entries --- I would think
> that the table read costs are the larger component anyway.

The data domain need not be just a dictionary; it can also be something like
article titles, URLs, and so on. On such relatively long strings (about 100
characters and more) this component becomes significant (especially when most
of the table is already in cache). In that case the search for near strings
can be accelerated by more than 10 times. I think this use case justifies
having a separate levenshtein_less_equal function.
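
To make the gain concrete: a bounded variant can abandon the
dynamic-programming matrix as soon as no cell in the current row can still
lead to a distance within the limit. The following is only a minimal
standalone sketch of that idea (the function name and the simple row-minimum
early exit are my own illustration, not the patch under discussion, which
bounds the computed band more aggressively):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of a bounded Levenshtein distance.  Returns the distance if it is
 * <= max_d, otherwise max_d + 1.  Illustrates the early-exit principle only;
 * it is not the fuzzystrmatch implementation.
 */
static int
levenshtein_less_equal_sketch(const char *s, const char *t, int max_d)
{
	int		m = (int) strlen(s);
	int		n = (int) strlen(t);
	int	   *prev = malloc((n + 1) * sizeof(int));
	int	   *curr = malloc((n + 1) * sizeof(int));
	int		i, j, result;

	/* distance from the empty prefix of s to each prefix of t */
	for (j = 0; j <= n; j++)
		prev[j] = j;

	for (i = 1; i <= m; i++)
	{
		int		row_min;

		curr[0] = i;
		row_min = i;
		for (j = 1; j <= n; j++)
		{
			int		cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
			int		best = prev[j] + 1;			/* deletion */

			if (curr[j - 1] + 1 < best)
				best = curr[j - 1] + 1;			/* insertion */
			if (prev[j - 1] + cost < best)
				best = prev[j - 1] + cost;		/* substitution */
			curr[j] = best;
			if (best < row_min)
				row_min = best;
		}

		/*
		 * Every path to the final cell passes through this row and edit
		 * costs are non-negative, so the final distance is at least row_min.
		 * Once row_min > max_d the answer cannot come back under the limit.
		 */
		if (row_min > max_d)
		{
			free(prev);
			free(curr);
			return max_d + 1;
		}

		{
			int	   *tmp = prev;		/* reuse the two rows */

			prev = curr;
			curr = tmp;
		}
	}

	result = prev[n];
	free(prev);
	free(curr);
	return (result > max_d) ? max_d + 1 : result;
}

int
main(void)
{
	/* bails out after a few rows: the strings are nowhere near distance 2 */
	printf("%d\n", levenshtein_less_equal_sketch("postgresql", "levenshtein", 2));
	return 0;
}

With an exit like this, a filter such as
levenshtein_less_equal(title, 'query string', 2) <= 2 only pays the full
O(m*n) cost for the handful of rows that are actually close, which is where
the speedup on ~100-character strings comes from.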
----
With best regards,
Alexander Korotkov.