Similarity search for sentences

From: "Janek Sendrowski" <janek12(at)web(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Similarity search for sentences
Date: 2013-12-05 11:51:55
Message-ID: trinity-b6932efc-dca4-4f7e-9ed4-dc5ae43701bf-1386244315450@3capp-webde-bs37
Lists: pgsql-general

Hi,
 
I have tables with millions of sentences. Each row contains one sentence. The text is natural language and can be in any language, but all sentences within one table are in the same language.
I have to do a similarity search on them. It has to be very fast, because I have to search for a few hundred sentences many times.
The search shouldn't be context-based. It should just return sentences with similar words (maybe stemmed).
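
To illustrate what "similar words" could mean here, the following is a minimal sketch that scores two sentences by the overlap of their word sets (Jaccard similarity). It only assumes a simple regex tokenizer and no stemming; the function names `word_set` and `jaccard` are made up for this example, and it says nothing about how to make such a search fast on millions of rows.

```python
import re


def word_set(sentence):
    """Lowercase the sentence and return its set of word tokens."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity of the word sets of two sentences (0.0 to 1.0)."""
    wa, wb = word_set(a), word_set(b)
    if not wa and not wb:
        # Two empty sentences are treated as identical (an assumption).
        return 1.0
    return len(wa & wb) / len(wa | wb)


if __name__ == "__main__":
    print(jaccard("The cat sat", "the cat ran"))
```

A stemmer could be applied to each token inside `word_set` if stemmed matching is wanted; that step is left out here.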
 
I have already tried GiST/GIN-index-based trigram search (pg_trgm extension), full-text search (tsearch2 extension) and pivot-based indexing (Fixed Query Array), but all of them are too slow or not suitable.
Soundex and Metaphone aren't suitable either.
 
I have been working on this project for a long time, but without any success.
Do any of you have an idea?
 
I would be very thankful for any help.
 
Janek Sendrowski
