From: | karavelov(at)mail(dot)bg |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: TS: Limited cover density ranking |
Date: | 2012-01-29 01:41:35 |
Message-ID: | 3c8f10c1e2ffb295f4013e020723249f.mailbg@mail.bg |
Lists: | pgsql-hackers |
----- Quote from Oleg Bartunov (oleg(at)sai(dot)msu(dot)su), on 28.01.2012 at 21:04 -----
> I suggest you work on more general approach, see
> http://www.sai.msu.su/~megera/wiki/2009-08-12 for example.
>
> btw, I don't like you changed ts_rank_cd arguments.
Hello Oleg,
Thanks for the feedback.
Is it OK to begin by adding an extra argument and a check in calc_rank_cd?
I could change the function names in order not to overload ts_rank_cd
arguments. My proposal is:
at sql level:
ts_rank_lcd([weights], tsvector, tsquery, limit, [method])
at C level:
ts_ranklcd_wttlf
ts_ranklcd_wttl
ts_ranklcd_ttlf
ts_ranklcd_ttl
Adding the functions could be done as an extension but they are just
trampolines into calc_rank_cd().
I agree that what you describe in the wiki page is a more general approach. So this:
SELECT ts_rank_lcd(to_tsvector('a b c'), to_tsquery('a&c'),2 )>0;
could be replaced with
SELECT to_tsvector('a b c') @@ to_tsquery('(a ?2 c)|(c ?2 a) ');
but if we need to look for 3 or more nearby terms in any order, the tsquery
with the '?' operator will become quite complicated. For example
SELECT tsvec @@
'(a ? b ? c) | (a ? c ? b) | (b ? a ? c) | (b ? c ? a) | (c ? a ? b) | (c ? b ? a)'::tsquery;
is the same as
SELECT ts_rank_lcd(tsvec, 'a&b&c'::tsquery,2)>0;
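To make the growth concrete, here is a small Python sketch (not part of the patch) that builds the OR-of-all-orderings query text for a list of terms. The helper name unordered_proximity_query is hypothetical, and the '?' operator is only the proposed syntax from the wiki page, so the output is just a string, not something a server is assumed to accept. The number of alternatives is n! for n terms.

```python
from itertools import permutations
from math import factorial


def unordered_proximity_query(terms, op="?"):
    """Build an OR of every ordering of `terms`, each joined by `op`.

    Hypothetical helper: it only produces query text in the proposed
    '?' syntax; it does not talk to PostgreSQL.
    """
    alternatives = (
        "(" + f" {op} ".join(order) + ")"
        for order in permutations(terms)
    )
    return " | ".join(alternatives)


# Three terms already need 3! = 6 alternatives.
print(unordered_proximity_query(["a", "b", "c"]))

# The number of alternatives grows factorially with the number of terms.
for n in range(2, 7):
    terms = [f"t{i}" for i in range(n)]
    branches = unordered_proximity_query(terms).count("|") + 1
    assert branches == factorial(n)
    print(n, "terms ->", branches, "alternatives")
```

For three terms this reproduces the query text above; for six terms the same construction already has 720 alternatives.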
This is why I think the general approach does not exclude the
usefulness of the approach that I am proposing.
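As a rough model of what the limit check is meant to decide, the following Python sketch tests whether all query terms occur, in any order, inside a window whose first and last positions differ by at most limit. This is an assumption about the intended semantics, inferred from the examples above; it is not the code in calc_rank_cd, it computes no rank, and word_positions and within_limit are hypothetical names. Positions come from a plain whitespace split, so there is no stemming or stop-word removal as to_tsvector would do.

```python
def word_positions(text):
    """Map each word to its 1-based positions (plain whitespace split)."""
    positions = {}
    for i, word in enumerate(text.split(), start=1):
        positions.setdefault(word, []).append(i)
    return positions


def within_limit(positions, terms, limit):
    """Return True if every term occurs within some window of width <= limit.

    Assumed semantics: window width is last position minus first position,
    and the order of the terms inside the window does not matter.
    """
    needed = set(terms)
    # All occurrences of the query terms, sorted by position.
    hits = sorted((p, t) for t in needed for p in positions.get(t, ()))
    counts = {}  # term -> occurrences inside the current window
    left = 0
    for pos, term in hits:
        counts[term] = counts.get(term, 0) + 1
        # Drop occurrences from the left while the window is too wide.
        while pos - hits[left][0] > limit:
            dropped = hits[left][1]
            counts[dropped] -= 1
            if counts[dropped] == 0:
                del counts[dropped]
            left += 1
        if len(counts) == len(needed):
            return True
    return False


doc = word_positions("a b c")
print(within_limit(doc, ["a", "c"], 2))       # 'a' at 1, 'c' at 3: width 2
print(within_limit(doc, ["a", "c"], 1))       # width 2 exceeds limit 1
print(within_limit(word_positions("c b a"), ["a", "b", "c"], 2))
```

A single pass over the sorted occurrences is enough here, which is the point of the comparison: the check stays one function call however many terms the query has.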
Best regards
--
Luben Karavelov