On Fri, Jul 26, 2013 at 7:54 AM, Janek Sendrowski <janek12(at)web(dot)de> wrote:
> Hi,
>
> I'm searching for an algorithm or index to find similar sentences in a database.
>
> Full-text search is not really suitable because it has no tolerance for inexact matches.
>
> The Levenshtein distance is too slow.
>
> I also tried the pg_trgm module, which works with trigrams, but it's also very slow with 100,000+ rows.
>
> I hope someone can help; I can't really find anything that is fast enough.
>
Have you tried pg_bigm (a bigram-based implementation)? It's still in
the development phase, but you could give it a try and see whether it
performs better where pg_trgm cannot.
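
For a rough idea of what the two approaches compare, here is a minimal
Python sketch of n-gram set similarity, with n=3 for trigrams and n=2
for bigrams. The function names, the padding scheme, and the Jaccard
scoring are assumptions made for illustration only; this is not the
code of pg_trgm or pg_bigm, and it says nothing about how either
module performs with an index.

```python
def ngrams(text, n):
    """Return the set of character n-grams of each word in text.

    Assumption: each word is lowercased and padded with n-1 leading
    spaces and one trailing space. Real extensions may pad differently.
    """
    grams = set()
    for word in text.lower().split():
        padded = " " * (n - 1) + word + " "
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    return grams


def similarity(a, b, n):
    """Jaccard similarity of the n-gram sets of a and b (0.0 to 1.0)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    s1 = "find similar sentences in a database"
    s2 = "finding similar sentence in database"
    print("trigram similarity:", similarity(s1, s2, 3))
    print("bigram similarity: ", similarity(s1, s2, 2))
```

Shorter n-grams produce more overlap between loosely related strings,
so whether bigrams help is something to measure on your own data.
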
--
Amit Langote