FTS with more than one language in body and with unknown query language?

From: Stefan Keller <sfkeller(at)gmail(dot)com>
To: pgsql-general List <pgsql-general(at)postgresql(dot)org>
Subject: FTS with more than one language in body and with unknown query language?
Date: 2016-07-13 22:16:21
Message-ID: CAFcOn29UxC2LpgopY-7-NC=3=9Vzu6-STXjDEimW5uMbE4fTkw@mail.gmail.com
Lists: pgsql-general

Hi,

I have a text corpus in which each document is either German or English,
and I expect queries whose language I don't know in advance. So I'd
like, for example, a query for "forest" to match "forest" in body_en but
also "Wald" in body_de.

I created a table with attributes body_en and body_de (type "text"). I
will compute tsvector/tsquery values on the fly (I don't need an index
on these attributes yet).
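
To make the setup concrete, here is a minimal sketch of what I have in
mind. The table name "docs", the "id" column, and the literal query string
are placeholders for illustration; only body_en and body_de are the real
attribute names. Each attribute is parsed with its own built-in
configuration, and the query string is parsed once per language because
its language is unknown. Note that this only matches "forest" as an
English or German word; it does not by itself make "forest" match "Wald".

```sql
-- Hypothetical table; only body_en and body_de come from the description above.
CREATE TABLE docs (
    id      serial PRIMARY KEY,
    body_en text,
    body_de text
);

-- On-the-fly search without an index: each attribute is parsed with its
-- own configuration, and the query string is parsed once per language.
SELECT id
FROM docs
WHERE to_tsvector('english', coalesce(body_en, ''))
          @@ plainto_tsquery('english', 'forest')
   OR to_tsvector('german', coalesce(body_de, ''))
          @@ plainto_tsquery('german', 'forest');
```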

* Can FTS handle this multilingual situation?
* How do I set up a text search configuration which stems e.g. both en
and de words?
* Should I create a synonym dictionary which contains en-de word
translations instead of en-en synonyms?
* Any hints to related work where FTS has been used in a multilingual context?

:Stefan
