From: | xu fei <autofei(at)yahoo(dot)com> |
---|---|
To: | Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: FTS uses "tsquery" directly in the query |
Date: | 2010-01-25 23:30:01 |
Message-ID: | 362774.80233.qm@web45404.mail.sp1.yahoo.com |
Lists: | pgsql-general |
Hi, Ivan:
I agree with you, and I would also like to 'hack' into the code. The current FTS is the best one among database systems and a great building block for supporting more functions. Here are some I can think of:
- choose "|" or "&" through an optional parameter for to_tsquery and to_tsvector.
- choose whether or not to normalize in to_tsquery and to_tsvector.
- the two current rankings are not enough. For the default ts_rank, I have not figured out the algorithm. For ts_rank_cd, we have the paper, but it is designed for short queries with 2 or 3 tokens.
The normalization could be similar to Apache Lucene, where it is really easy to modify things and build your own tokenizer. I still feel confused after reading the manual.
I am not sure whether there is currently a team helping Oleg Bartunov. If needed, I can try to do something rather than just hacking on it. I am sure Ivan will join in as well. :)
Xu
--- On Mon, 1/25/10, Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it> wrote:
From: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
Subject: Re: [GENERAL] FTS uses "tsquery" directly in the query
To: pgsql-general(at)postgresql(dot)org
Date: Monday, January 25, 2010, 4:33 PM
On Mon, 25 Jan 2010 23:35:12 +0300 (MSK)
Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
> Do you guys want something like:
>
> arxiv=# select and2or(to_tsquery('1 & 2 & 3'));
> and2or
> ---------------------
> ( '1' | '2' ) | '3'
> (1 row)
Nearly. I'm starting from a weighted tsvector not from text/tsquery.
I would like to:
- keep the weights in the query
- avoid parsing the text to extract lexemes twice (I already have a
tsvector)
For me extending pg in C is a new science, but I'm actually trying
to write at least a couple of functions that:
- will return a tsvector as a set of (weight int, pos int[], lexeme text)
records
- will turn a tsvector + operator into a tsquery
'orange':A1,2,3 'banana':B4,5 'tomato':C6,7 ->
'orange':A | 'banana':B | 'tomato':C
or eventually
'orange':A & 'banana':B & 'tomato':C
thanks
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
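
The two functions Ivan describes can be sketched at the string level as follows. This is only an illustration in Python, not the C extension he is writing and not PostgreSQL's own API: the names `parse_tsvector` and `tsvector_to_tsquery` are hypothetical, the input is parsed in the notation used in the message (weight letter before the positions, whereas PostgreSQL itself prints the weight after a position, e.g. `'orange':1A,2,3`), and the weight is kept as a letter rather than an int.

```python
import re

# One entry in the notation used in the message: 'lexeme':W1,2,3
# (assumption: every entry carries an explicit weight letter A-D).
ENTRY = re.compile(r"'([^']+)':([A-D])(\d+(?:,\d+)*)")


def parse_tsvector(text):
    """Return a list of (weight, positions, lexeme) records."""
    return [
        (weight, [int(p) for p in positions.split(",")], lexeme)
        for lexeme, weight, positions in ENTRY.findall(text)
    ]


def tsvector_to_tsquery(text, op="|"):
    """Join the weighted lexemes with the chosen operator, dropping positions."""
    if op not in ("|", "&"):
        raise ValueError("op must be '|' or '&'")
    terms = [f"'{lexeme}':{weight}" for weight, _, lexeme in parse_tsvector(text)]
    return f" {op} ".join(terms)


vector = "'orange':A1,2,3 'banana':B4,5 'tomato':C6,7"
print(tsvector_to_tsquery(vector))       # OR form
print(tsvector_to_tsquery(vector, "&"))  # AND form
```

Because the lexemes come straight from the existing tsvector, the text is never parsed a second time and the weights carry over into the query, which are the two requirements listed in the message.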