Re: Ellipses around result fragment of ts_headline

From: Asher Snyder <asnyder(at)noloh(dot)com>
To: sushant354(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Ellipses around result fragment of ts_headline
Date: 2009-02-14 21:21:28
Message-ID: 009d01c98eea$32cc96d0$9865c470$@com
Lists: pgsql-hackers

Interesting. It could be that you already do it, but the documentation makes
no reference to a fragment delimiter, so I can see no way to add one. The
documentation for ts_headline lists only StartSel, StopSel, MaxWords,
MinWords, ShortWord, and HighlightAll; there appears to be no option for a
fragment delimiter.

In my case I do:

SELECT v1.id, v1.type_id, v1.title,
       ts_headline(v1.copy, query, 'MinWords = 17') AS copy,
       ts_rank(v1.text_search, query) AS rank
FROM (SELECT b1.*,
             (setweight(to_tsvector(coalesce(b1.title, '')), 'A') ||
              setweight(to_tsvector(coalesce(b1.copy, '')), 'B')) AS text_search
      FROM search.v_searchable_content b1) v1,
     plainto_tsquery($1) query
WHERE ($2 IS NULL OR (type_id = ANY($2)))
  AND query @@ v1.text_search
ORDER BY rank DESC, title

Now, this use of ts_headline correctly returns highlighted, fragmented
search results, but there is no fragment delimiter in the headline.
One suggestion was to change ts_headline(v1.copy, query, 'MinWords = 17')
to '...' || ts_headline(v1.copy, query, 'MinWords = 17') || '...', but as you
can see, that would add the ellipses unconditionally, with no regard to where
the fragment falls in the document. I hope that you're correct and that it is
implemented and simply not documented.
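
For illustration, the position-aware behavior described above could be
sketched in C roughly as follows. This is only a sketch under stated
assumptions: wrap_fragment is a hypothetical helper, not a PostgreSQL
function, and it assumes the caller already knows the fragment's offset in
the document and the document's length, which ts_headline does not expose
today. The fragment is also assumed to be plain text without highlighting
markup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL): wrap a headline fragment in
 * ellipses only on the sides where document text was actually cut off.
 *
 * frag      fragment text, NUL-terminated, without highlighting markup
 * frag_off  offset of the fragment within the source document
 * doc_len   length of the source document
 *
 * Returns a malloc'd string that the caller must free, or NULL if the
 * allocation fails.
 */
char *
wrap_fragment(const char *frag, size_t frag_off, size_t doc_len)
{
    static const char ellipsis[] = "...";
    const size_t elen = sizeof(ellipsis) - 1;
    size_t frag_len;
    int lead;
    int trail;
    char *out;
    char *ptr;

    assert(frag != NULL);
    frag_len = strlen(frag);
    /* the fragment must lie entirely inside the document */
    assert(frag_off <= doc_len && frag_len <= doc_len - frag_off);

    /* leading ellipsis only if text precedes the fragment */
    lead = frag_off > 0;
    /* trailing ellipsis only if text follows the fragment */
    trail = frag_off + frag_len < doc_len;

    out = malloc(frag_len + 2 * elen + 1);
    if (out == NULL)
        return NULL;

    ptr = out;
    if (lead)
    {
        memcpy(ptr, ellipsis, elen);
        ptr += elen;
    }
    memcpy(ptr, frag, frag_len);
    ptr += frag_len;
    if (trail)
    {
        memcpy(ptr, ellipsis, elen);
        ptr += elen;
    }
    *ptr = '\0';
    return out;
}
```

With the 19-character document "The quick brown fox",
wrap_fragment("quick brown", 4, 19) gives "...quick brown...", while
wrap_fragment("The quick", 0, 19) gives "The quick..." with no leading
ellipsis, which is the distinction the unconditional '...' || ... || '...'
workaround cannot make.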

>-----Original Message-----
>From: Sushant Sinha [mailto:sushant354(at)gmail(dot)com]
>Sent: Saturday, February 14, 2009 4:07 PM
>To: Asher Snyder
>Cc: pgsql-hackers(at)postgresql(dot)org
>Subject: Re: [HACKERS] Ellipses around result fragment of ts_headline
>
>I think we currently do that. We add ellipses only when we encounter a
>new fragment. So there should not be ellipses if we are at the end of
>the document or if that is the first fragment (includes the beginning of
>the document). Here is the code in generateHeadline, ts_parse.c that
>adds the ellipses:
>
>    if (!infrag)
>    {
>        /* start of a new fragment */
>        infrag = 1;
>        numfragments++;
>        /* add a fragment delimitor if this is after the first one */
>        if (numfragments > 1)
>        {
>            memcpy(ptr, prs->fragdelim, prs->fragdelimlen);
>            ptr += prs->fragdelimlen;
>        }
>    }
>
>It is possible that there is a bug that needs to be fixed. Can you show
>me an example where you found that?
>
>-Sushant.
>
>
>
>
>On Sat, 2009-02-14 at 15:13 -0500, Asher Snyder wrote:
>> It would be very useful if there were an option to have ts_headline
>> append ellipses before or after a result fragment based on the
>> position of the fragment in the source document. For instance, when
>> running ts_headline(doc, query) it will correctly return a fragment
>> with words highlighted, however, there's no easy way to determine
>> whether this returned fragment is at the beginning or end of the
>> original doc, and add the necessary ellipses.
>>
>> Searches such as postgresql.org ALWAYS add ellipses before or after
>> the fragment regardless of whether or not ellipses are warranted. In
>> my opinion always adding ellipses to the fragment is deceptive to the
>> user, in many of my search result cases, the fragment is at the
>> beginning of the doc, and would confuse the user to always see
>> ellipses. So you can see how useful the feature described above would
>> be beneficial to the accuracy of the search result fragment.
