Re: TSearch queries with multiple languages

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gordon Callan <gordon_callan(at)hotmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: TSearch queries with multiple languages
Date: 2009-02-13 00:07:02
Message-ID: 19463.1234483622@sss.pgh.pa.us
Lists: pgsql-general

Gordon Callan <gordon_callan(at)hotmail(dot)com> writes:
> Next we create an index on the tsvector column:
> CREATE INDEX node_ts_body on node USING gin(ts_body);

> From the documentation, it seems this index will know what config each row has.

No, actually the index doesn't know and doesn't care. The tsvector
representation is language-independent --- it contains "just strings".
All the language-dependent processing happens during reduction of the
document text to tsvector (or reduction of a search string to tsquery).
So if words from different languages happen to reduce to the same
string, searches in both languages will find that entry.
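
As a rough illustration of the point above, the same text can be reduced under two configurations and the results compared. The sample word is chosen only for demonstration, and the exact lexemes produced depend on the dictionaries installed; the index only ever sees the resulting strings.

```sql
-- Reduce the same text under two different text search configurations.
-- Whatever strings come out are all that a GIN index on the column stores.
SELECT to_tsvector('english', 'information') AS english_vec,
       to_tsvector('french',  'information') AS french_vec;

-- The search string is reduced the same way before it is matched.
SELECT to_tsquery('english', 'information') AS english_query,
       to_tsquery('french',  'information') AS french_query;
```

If both configurations yield the same lexeme, a query built with either one will match rows that were indexed with the other.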

Usually this works the way people want; but if not, you could add an
additional WHERE condition to your queries to match only documents in
the desired language.
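
A minimal sketch of such a query, assuming the node table has a column recording which configuration built each row's ts_body. The column name ts_config and its regconfig type are hypothetical here, as are the search terms.

```sql
-- Hypothetical: node.ts_config (regconfig) holds the configuration
-- that was used to build ts_body for that row.
SELECT *
FROM node
WHERE ts_body @@ to_tsquery('english', 'search & terms')
  AND ts_config = 'english'::regconfig;  -- keep only English documents
```

The GIN index can still serve the @@ condition; the extra condition just filters the matching rows by language.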

regards, tom lane
