Re: Ellipses around result fragment of ts_headline

From: Sushant Sinha <sushant354(at)gmail(dot)com>
To: Asher Snyder <asnyder(at)noloh(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Ellipses around result fragment of ts_headline
Date: 2009-02-14 21:40:42
Message-ID: 1234647642.6298.11.camel@dragflick
Lists: pgsql-hackers

The documentation in 8.4dev has information on FragmentDelimiter:
http://developer.postgresql.org/pgdocs/postgres/textsearch-controls.html

If you do not specify MaxFragments > 0, then the default headline
generator kicks in. The default headline generator does not have any
fragment delimiter. So it is correct that you will not see any
delimiter.
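
To make the "delimiter only between fragments" behaviour concrete, here is a
minimal standalone sketch. It is not the PostgreSQL code: join_fragments and
all of its parameters are hypothetical names chosen for illustration, and it
only mirrors the idea that a delimiter is written before the second and later
fragments, so a headline consisting of a single fragment never contains one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL code: join nfrags fragments into out,
 * writing delim only before the second and later fragments.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t) -1 if out is too small.
 */
static size_t
join_fragments(char *out, size_t outlen,
               const char *const *frags, size_t nfrags,
               const char *delim)
{
    size_t used = 0;
    size_t delimlen = strlen(delim);
    size_t i;

    assert(out != NULL && outlen > 0);

    for (i = 0; i < nfrags; i++)
    {
        size_t fraglen = strlen(frags[i]);
        size_t need = fraglen + (i > 0 ? delimlen : 0);

        /* keep one byte free for the terminating NUL */
        if (need >= outlen - used)
            return (size_t) -1;

        /* the delimiter goes only in front of fragments after the first */
        if (i > 0)
        {
            memcpy(out + used, delim, delimlen);
            used += delimlen;
        }
        memcpy(out + used, frags[i], fraglen);
        used += fraglen;
    }
    out[used] = '\0';
    return used;
}
```

Called with one fragment, the output is just that fragment. Called with two
fragments and a delimiter of " ... ", the delimiter appears once, between them.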

I think you are looking for the default headline generator to add
ellipses as well, depending on where the fragment is. I do not know what
other people's opinions on this are.
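
As a rough illustration of what position-dependent ellipses could look like,
here is a small standalone sketch. It is not a patch and not existing
PostgreSQL code: add_ellipses, its word-position parameters, and the "..."
string are all assumptions made for the example. It adds a leading ellipsis
only when the fragment does not start at the first word of the document, and a
trailing one only when it does not end at the last word.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not PostgreSQL code: wrap a headline fragment in
 * ellipses only on the sides where document text was cut off.
 * first_word and last_word are the fragment's word positions in the
 * document, doc_words is the document's total word count (assumed inputs).
 * Returns 0 on success, -1 if out is too small.
 */
static int
add_ellipses(char *out, size_t outlen, const char *frag,
             size_t first_word, size_t last_word, size_t doc_words)
{
    static const char ellipsis[] = "...";
    int lead;
    int trail;
    size_t need;

    assert(out != NULL && frag != NULL);
    assert(doc_words > 0 && first_word <= last_word && last_word < doc_words);

    lead = first_word > 0;               /* text was cut before the fragment */
    trail = last_word + 1 < doc_words;   /* text was cut after the fragment */

    need = strlen(frag) + (size_t) (lead + trail) * (sizeof(ellipsis) - 1) + 1;
    if (need > outlen)
        return -1;

    out[0] = '\0';
    if (lead)
        strcat(out, ellipsis);
    strcat(out, frag);
    if (trail)
        strcat(out, ellipsis);
    return 0;
}
```

With a five-word document, a fragment covering words 0-1 gets only a trailing
ellipsis, one covering words 1-2 gets both, and one covering words 3-4 gets
only a leading ellipsis.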

-Sushant.

On Sat, 2009-02-14 at 16:21 -0500, Asher Snyder wrote:
> Interesting. It could be that you already do it, but the documentation makes
> no reference to a fragment delimiter, so there's no way that I can see to
> add one. The documentation for ts_headline only lists StartSel, StopSel,
> MaxWords, MinWords, ShortWord, and HighlightAll; there appears to be no
> option for a fragment delimiter.
>
> In my case I do:
>
> SELECT v1.id, v1.type_id, v1.title,
>        ts_headline(v1.copy, query, 'MinWords = 17') AS copy,
>        ts_rank(v1.text_search, query) AS rank
> FROM (SELECT b1.*,
>              (setweight(to_tsvector(coalesce(b1.title, '')), 'A') ||
>               setweight(to_tsvector(coalesce(b1.copy, '')), 'B')) AS text_search
>       FROM search.v_searchable_content b1) v1,
>      plainto_tsquery($1) query
> WHERE ($2 IS NULL OR (type_id = ANY($2)))
>   AND query @@ v1.text_search
> ORDER BY rank DESC, title
>
> Now, this use of ts_headline correctly returns highlighted, fragmented
> search results, but there is no fragment delimiter in the headline.
> Some suggestions were to change ts_headline(v1.copy, query, 'MinWords = 17')
> to '...' || ts_headline(v1.copy, query, 'MinWords = 17') || '...', but as you
> can clearly see, the ellipses would then always be added, with no regard for
> where the fragment falls in the document. I hope that you're correct and
> that it is implemented but not documented.
>
> >-----Original Message-----
> >From: Sushant Sinha [mailto:sushant354(at)gmail(dot)com]
> >Sent: Saturday, February 14, 2009 4:07 PM
> >To: Asher Snyder
> >Cc: pgsql-hackers(at)postgresql(dot)org
> >Subject: Re: [HACKERS] Ellipses around result fragment of ts_headline
> >
> >I think we currently do that. We add ellipses only when we encounter a
> >new fragment. So there should not be ellipses if we are at the end of
> >the document or if that is the first fragment (includes the beginning of
> >the document). Here is the code in generateHeadline, ts_parse.c that
> >adds the ellipses:
> >
> >    if (!infrag)
> >    {
> >        /* start of a new fragment */
> >        infrag = 1;
> >        numfragments++;
> >        /* add a fragment delimiter if this is after the first one */
> >        if (numfragments > 1)
> >        {
> >            memcpy(ptr, prs->fragdelim, prs->fragdelimlen);
> >            ptr += prs->fragdelimlen;
> >        }
> >    }
> >
> >It is possible that there is a bug that needs to be fixed. Can you show
> >me an example where you found that?
> >
> >-Sushant.
> >
> >
> >
> >
> >On Sat, 2009-02-14 at 15:13 -0500, Asher Snyder wrote:
> >> It would be very useful if there were an option to have ts_headline append
> >> ellipses before or after a result fragment based on the position of the
> >> fragment in the source document. For instance, running ts_headline(doc,
> >> query) correctly returns a fragment with words highlighted; however,
> >> there's no easy way to determine whether the returned fragment is at the
> >> beginning or end of the original doc, and add the necessary ellipses.
> >>
> >> Searches such as postgresql.org ALWAYS add ellipses before or after the
> >> fragment regardless of whether or not ellipses are warranted. In my opinion,
> >> always adding ellipses to the fragment is deceptive to the user. In many of
> >> my search result cases, the fragment is at the beginning of the doc, and
> >> always seeing ellipses would confuse the user. So you can see how the
> >> feature described above would benefit the accuracy of the search
> >> result fragment.
> >>
> >>
> >>
> >>
> >>
>
>
