FTS phrase searches

From: Glenn Maynard <glenn(at)zewt(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: FTS phrase searches
Date: 2010-11-01 20:35:59
Message-ID: AANLkTinnUH_JHY8WeTnA59fkR6zczL_UZj5EA2+VmX5z@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

How are adjacent word searches handled with FTS? tsquery doesn't do
this, so I assume this has to be done as a separate filter step, eg.:

# "large house" sales
SELECT * FROM data WHERE fts @@ to_tsquery('large & house & sales')
AND tsvector_contains_phrase(fts, to_tsvector('large house')));

to do an indexed search for "large & house & sales" and then to narrow
the results to where "large house" actually appears as a phrase (eg.
adjacent positions at the same weight). I can't find any function to
do that, though. (Presumably, it would return true if all of the
words in the second tsvector exist in the first, with the same
positions relative to each other.)

"tsvector <@ tsvector" seems logical, but isn't supported.

This isn't as simple as using LIKE, since that'll ignore stemming,
tokenization rules, etc. If the language rules allow this to match
"larger house" or "large-house", then a phrase restriction should,
too. It's also painful when the FTS column is an aggregate of several
other columns (eg. title and body), since a LIKE match needs to know
that and check all of them separately.

Any hints? This is pretty important to even simpler search systems.

--
Glenn Maynard

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2010-11-01 20:39:20 Re: can select contents of view but not view itself, despite indirect membership
Previous Message Jonathan Tripathy 2010-11-01 20:29:28 Re: Replication