From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alex Malek <magicagent(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, ngigi(at)at(dot)co(dot)ke
Subject: Re: BUG #15172: Postgresql ts_headline with <-> operator does not highlight text properly
Date: 2023-10-28 23:34:42
Message-ID: ZT2akojpqo9rLiV_@momjian.us
Lists: pgsql-bugs

On Sat, Oct 28, 2023 at 04:46:40PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Is this documented somewhere?
>
> The docs [1] only say that ts_headline "returns an excerpt from the
> document in which terms from the query are highlighted". This
> behavior does not violate that admittedly-weak contract.
>
> IIRC, ts_headline does attempt to find a text fragment or fragments
> that fully satisfy the query (e.g., include an exact phrase match)
> but it will then highlight all the matching words in the fragment,
> not only the location of the phrase match. I do not agree with the

I see what you mean in this query output:

SELECT ts_headline('English','kj asdlkjf alds jflkasjd flkaj dsflkja sdlfk jaslfd kjasdlfkj salfdkj This Commercial Bank does not have any Equity in Europe but European Commercial Bank does lkj sadlkjf asldkjf alskjd flsakj fdlkaj dfaslkfd jlakds jaslkfdj',
('''european'' <-> ''commerci'' <-> ''bank''')::tsquery);
ts_headline
---------------------------------------------------------------------------------------------------------------------------------
Europe but <b>European</b> <b>Commercial</b> <b>Bank</b> does lkj sadlkjf asldkjf alskjd flsakj fdlkaj dfaslkfd jlakds jaslkfdj

The query controls the fragment chosen.
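
For comparison, a shorter variant of the same query can be sketched as
follows. The shortened document text and the use of phraseto_tsquery() to
build the phrase query are assumptions made for illustration and are not
part of the original report. Running it should show which words are
highlighted in addition to the phrase match itself.

```sql
-- Sketch only: the document text is shortened for illustration.
-- phraseto_tsquery() builds the same phrase query as the cast above:
--   'european' <-> 'commerci' <-> 'bank'
SELECT ts_headline(
    'english',
    'This Commercial Bank does not have any Equity in Europe but European Commercial Bank does',
    phraseto_tsquery('english', 'European Commercial Bank')
);
```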

> OP's opinion that that's wrong. The highlight-em-all approach has its
> own value, and in any case it may not be possible to find a full match
> that satisfies the function's other constraints such as MaxWords.
> Refusing to highlight anything in that event would be unhelpful.
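
The interaction with MaxWords mentioned above can be explored with a
sketch like the one below. The option values are arbitrary choices for
illustration, and the custom StartSel/StopSel markers are only there to
make the highlighted words easy to spot; none of this comes from the
original report.

```sql
-- Sketch only: option values are illustrative assumptions.
-- MinWords must be smaller than MaxWords, so both are set explicitly.
SELECT ts_headline(
    'english',
    'This Commercial Bank does not have any Equity in Europe but European Commercial Bank does',
    phraseto_tsquery('english', 'European Commercial Bank'),
    'MaxWords=8, MinWords=4, StartSel=<<, StopSel=>>'
);
```

Varying MaxWords and MinWords shows how the length limits affect which
fragment is returned and which words inside it are marked.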

Attached is a proposed doc patch.

I hope people don't mind me addressing these old emails, but I think
they raise important issues, and while I wasn't able to deal with them
when they were posted, I have time over the next month to do so.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

Attachment: headline.diff (text/x-diff, 923 bytes)
