Re: Fastest Index/Algorithm to find similar sentences

From:	Amit Langote <amitlangote09(at)gmail(dot)com>
To:	Janek Sendrowski <janek12(at)web(dot)de>
Cc:	Postgres General <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Fastest Index/Algorithm to find similar sentences
Date:	2013-07-26 05:58:51
Message-ID:	CA+HiwqGXXsX1OdZKv7m4241GiyYg4bDU4rXtaDCW-Ac36ab7ww@mail.gmail.com
Lists:	pgsql-general

On Fri, Jul 26, 2013 at 7:54 AM, Janek Sendrowski <janek12(at)web(dot)de> wrote:
> Hi,
>
> I'm searching for an algorithm or index to find similar sentences in a database.
>
> Full-text search is not really suitable because it doesn't tolerate inexact matches.
>
> The Levenshtein distance is too slow.
>
> I also tried the pg_trgm module, which works with tri-grams, but it's also very slow with 100,000+ rows.
>
> I hope someone can help; I can't really find something that is fast enough.
>

Have you tried pg_bigm (a bi-gram based implementation)? It's still in
the development phase, but you could give it a try and see whether it
performs better where pg_trgm cannot.
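
To make the bi-gram versus tri-gram distinction concrete, here is a minimal
Python sketch of n-gram set similarity. It is only an illustration: the
function names (ngrams, similarity) are made up for this example, and the
normalization is simplified, so the numbers will not match what pg_trgm or
pg_bigm report. Neither module is implemented this way internally as far as
this sketch is concerned; it only shows the general idea of comparing sets
of short character sequences.

```python
from __future__ import annotations


def ngrams(text: str, n: int) -> set[str]:
    """Return the set of character n-grams of a lowercased string.

    Assumption: whitespace runs are collapsed to a single space; no
    word padding is applied, unlike real trigram implementations.
    """
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of the two n-gram sets, between 0.0 and 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        # Both strings are shorter than n characters: nothing to compare.
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    s1 = "The quick brown fox"
    s2 = "The quick brown dog"
    print("tri-gram similarity:", similarity(s1, s2, n=3))
    print("bi-gram similarity: ", similarity(s1, s2, n=2))
```

Shorter n-grams produce more overlap between loosely related strings, so
n=2 is more tolerant but less selective than n=3. Whether that trade-off
helps with speed depends on the index and the data, so it is worth
measuring on your own table.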

--
Amit Langote

In response to

Fastest Index/Algorithm to find similar sentences at 2013-07-25 22:54:34 from Janek Sendrowski

Responses

Re: Fastest Index/Algorithm to find similar sentences at 2013-07-27 00:02:10 from Janek Sendrowski
