Re: Ranking search results using multiple fields in PostgreSQL fulltext search

From: Gaini Rajeshwar <raja(dot)rajeshwar2006(at)gmail(dot)com>
To: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Ranking search results using multiple fields in PostgreSQL fulltext search
Date: 2009-10-13 06:12:47
Message-ID: 56b36eb60910122312o1684d6dcrbd1f2e966e47ba0d@mail.gmail.com
Lists: pgsql-general

On Mon, Oct 12, 2009 at 8:02 PM, Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it> wrote:

> On Mon, 12 Oct 2009 19:26:55 +0530
> Gaini Rajeshwar <raja(dot)rajeshwar2006(at)gmail(dot)com> wrote:
>
> > Ivan,
> > If I create a tsvector with the concatenation operator as you
> > mentioned, my search query will search in every field that is
> > concatenated into my tsvector.
> > For example, if I create the tsvector like this,
> > UPDATE document_table SET search_col =
> > setweight(to_tsvector(coalesce(title,'')), 'A') ||
> > setweight(to_tsvector(coalesce(summary,'')), 'B');
> >
> > and do a query like this
> > select title, ts_rank(search_col,
> >        to_tsquery('this & is & my & text & search')) AS rank
> > FROM document_table
> > WHERE search_col @@ to_tsquery('this & is & my & text & search')
> > ORDER BY rank DESC
> > the above query will search in title and summary and give me
> > the results. But I don't want it that way. When a user wants to
> > search in title, it should search only in title, but the results
> > should be ranked based on the *title* and *summary* fields.
>
> Search *just* in title specifying the weight in the input query and
> rank on title and summary.
>
> /*
> -- somewhere else in your code...
> search_col := setweight(cfg, title, 'A', '&');
> search_col := search_col && setweight(cfg, summary, 'B', '&');
> */
>
>
> select ts_rank(search_col, to_tsquery(inputtitle)) as rank
> -- rank on both if search_col just contains title and summary
> ...
> where search_col @@ setweight(cfg, inputtitle, 'A', '&')
> -- return just matching title
> order by ts_rank(...)
>
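
For reference, a minimal sketch of the weight-restricted search described
above, written with the built-in functions only. It assumes the
document_table and search_col from my earlier message; the 'english'
configuration and the search terms are just placeholders. A lexeme in a
tsquery can carry a weight label such as :A, which restricts it to
matching tsvector lexemes of that weight, so the WHERE clause matches
title words only while ts_rank() still sees both title and summary.

```sql
-- Assumption: search_col was built as
--   setweight(to_tsvector(title), 'A') || setweight(to_tsvector(summary), 'B')
-- The 'english' configuration and the search terms are illustrative.
SELECT title,
       -- Unlabeled query: rank uses matches from both title (A) and summary (B).
       ts_rank(search_col, to_tsquery('english', 'text & search')) AS rank
FROM document_table
-- Labeled query: only lexemes stored with weight A (the title) can match.
WHERE search_col @@ to_tsquery('english', 'text:A & search:A')
ORDER BY rank DESC;
```
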
Yes, that is true, but this method is difficult to use in my application.
I want to rank results based on many fields, and concatenating all of
them into search_col is problematic because *PostgreSQL supports at most
256 positions per lexeme and a tsvector must be smaller than 1MB*. If I
concatenate many fields this way, the result can be quite lossy, and my
ranking may not be accurate.
Instead, I am wondering whether I can specify more than one field in the
ts_rank() function itself, rather than passing *search_col*, which is
prepared by concatenating the other fields.
I hope I am being clear. Let me know if I am not making sense.
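
To make the question concrete, here is a rough sketch of what I have in
mind. ts_rank() takes a single tsvector, so this version keeps one
tsvector per field and adds the per-field ranks together. The columns
title_tsv and summary_tsv are hypothetical (they are not in my current
table), and the 1.0 and 0.4 multipliers are arbitrary values chosen only
for illustration, not a recommended weighting.

```sql
-- Assumption: title_tsv and summary_tsv are hypothetical tsvector columns,
-- each built from a single field, e.g. to_tsvector('english', title).
SELECT title,
       -- Combine the per-field ranks; the multipliers are arbitrary.
       1.0 * ts_rank(title_tsv, q) + 0.4 * ts_rank(summary_tsv, q) AS rank
FROM document_table,
     to_tsquery('english', 'text & search') AS q
-- Match on the title only; the summary contributes to ranking alone.
WHERE title_tsv @@ q
ORDER BY rank DESC;
```

Since each field has its own tsvector, the position and size limits
apply per field instead of to one concatenated value.
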

> is it what you need?
>

>
> This is just one of the possible ways to rank something...
>
> otherwise: really understand how rank is computed, keep
> columns/ts_vector separated, compute rank for each column and pass
> the result to some magic function that will compute a "cumulative"
> ranking...
> Or you could write your own ts_rank... but I tend to trust Oleg and
> common practice with pg rather than inventing my own ranking
> function.
>
> Right now ts_rank* are black boxes for me. I envisioned I may enjoy
> some finer tuning on ranking... but currently they really do a good
> job.
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
>
