Re: english parser in text search: support for multiple words in the same position

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	sushant354(at)gmail(dot)com
Cc:	Markus Wanner <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: english parser in text search: support for multiple words in the same position
Date:	2010-08-02 14:20:04
Message-ID:	15782.1280758804@sss.pgh.pa.us
Lists:	pgsql-hackers

Sushant Sinha <sushant354(at)gmail(dot)com> writes:
>> This would needlessly increase the number of tokens. Instead you'd
>> better make it work like compound word support, having just "wikipedia"
>> and "org" as tokens.

> The current text parser already returns url and url_path. That already
> increases the number of unique tokens. I am only asking to add
> normal English words as well, so that if someone types only "wikipedia"
> he gets a match.

The suggestion to make it work like compound words is still a good one,
i.e., given wikipedia.org you'd get back

host wikipedia.org
host-part wikipedia
host-part org

not just the "host" token as at present.

Then the user could decide whether he needed to index hostname
components or not, by choosing whether to forward host-part
tokens to a dictionary or just discard them.
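
The idea above can be sketched roughly as follows. This is an illustrative
Python model only, not the actual parser code (which is C); the function
names parse_host and index_terms, the dict-based mapping, and the
"host-part" token name as a real parser token type are all assumptions
made for the example.

```python
def parse_host(hostname):
    """Emit (token_type, text) pairs for a hostname, compound-word style.

    Hypothetical model: the whole name comes back as a "host" token,
    followed by one "host-part" token per dot-separated component.
    """
    tokens = [("host", hostname)]
    parts = [p for p in hostname.split(".") if p]
    if len(parts) > 1:
        tokens.extend(("host-part", p) for p in parts)
    return tokens


def index_terms(tokens, mapping):
    """Keep tokens whose type is mapped to a dictionary; discard the rest.

    `mapping` stands in for a text search configuration: token type ->
    normalizing function (a stand-in for a dictionary). Unmapped token
    types are dropped.
    """
    terms = []
    for token_type, text in tokens:
        normalize = mapping.get(token_type)
        if normalize is not None:
            terms.append(normalize(text))
    return terms


tokens = parse_host("wikipedia.org")

# Configuration that ignores host-part tokens (same result as today).
host_only = {"host": str.lower}

# Configuration that also indexes the hostname components.
with_parts = {"host": str.lower, "host-part": str.lower}

print(index_terms(tokens, host_only))
print(index_terms(tokens, with_parts))
```

With the first mapping only the full hostname is indexed, so existing
configurations keep their behavior; with the second, a search for
"wikipedia" alone can match as well.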

If you submit a patch that tries to force the issue by classifying
hostname parts as plain words, it'll probably get rejected out of
hand on backwards-compatibility grounds.

regards, tom lane

In response to

Re: english parser in text search: support for multiple words in the same position at 2010-08-02 13:12:50 from Sushant Sinha

Responses

Re: english parser in text search: support for multiple words in the same position at 2010-09-01 06:42:04 from Sushant Sinha
