From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Gordon Callan <gordon_callan(at)hotmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: TSearch queries with multiple languages |
Date: | 2009-02-13 00:07:02 |
Message-ID: | 19463.1234483622@sss.pgh.pa.us |
Lists: | pgsql-general |
Gordon Callan <gordon_callan(at)hotmail(dot)com> writes:
> Next we create an index on the ts_vector column:
> CREATE INDEX node_ts_body on node USING gin(ts_body);
> From the documentation, it seems this index will know what config each row has.
No, actually the index doesn't know and doesn't care. The tsvector
representation is language-independent --- it contains "just strings".
All the language-dependent processing happens during reduction of the
document text to tsvector (or reduction of a search string to tsquery).
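
To illustrate the point above, here is a minimal sketch that reduces the same text with two configurations. The sample sentence is invented for illustration; 'english' and 'simple' are stock configurations, but whether they match what you used for your rows is an assumption.

```sql
-- Same input text, two configurations. Only the reduction step is
-- language-dependent; each result is a plain tsvector of strings.
SELECT to_tsvector('english', 'The cats are running') AS english_vec,
       to_tsvector('simple',  'The cats are running') AS simple_vec;

-- A search string goes through the same kind of reduction before it is
-- matched, so the configuration is named again on the query side.
SELECT to_tsvector('english', 'The cats are running')
       @@ to_tsquery('english', 'cats & running') AS matches;
```

Nothing in either tsvector records which configuration produced it, which is why a GIN index over the column cannot tell the rows apart by language.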
So if words from different languages happen to reduce to the same
string, searches in both languages will find that entry.
Usually this works the way people want; but if not, you could add an
additional WHERE condition to your queries to match only documents in
the desired language.
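
Such a query might look roughly like the sketch below. It assumes a hypothetical text column node.language that records which configuration each row's ts_body was built with; that column name and the search term are made up for illustration and do not come from your schema.

```sql
-- Hypothetical: node.language holds the configuration name used when
-- ts_body was computed for the row (for example 'english' or 'french').
SELECT n.*
FROM node AS n
WHERE n.ts_body @@ to_tsquery('french', 'chat')  -- reduce the query as French
  AND n.language = 'french';                     -- keep only French documents
```

The first condition can still use the GIN index on ts_body; the second only filters the rows that the index returns.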
regards, tom lane