From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Alex Malek <magicagent(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org, ngigi(at)at(dot)co(dot)ke |
Subject: | Re: BUG #15172: Postgresql ts_headline with <-> operator does not highlight text properly |
Date: | 2023-10-28 19:42:16 |
Message-ID: | ZT1kGIbiALGclTUA@momjian.us |
Lists: | pgsql-bugs |
On Wed, Aug 3, 2022 at 02:02:51PM -0400, Alex Malek wrote:
> On Wed, Aug 3, 2022 at 1:58 PM PG Bug reporting form <noreply(at)postgresql(dot)org>
> wrote:
> I have noticed a likely bug when using ts_headline with the <-> operator.
>
> Assuming the following query:
>
> SELECT ts_headline('English','This Commercial Bank does not have any Equity
> in Europe but European Commercial Bank does',
> phraseto_tsquery('English','European Commercial
> Bank')::tsquery);
>
> The returned result is:
> This <b>Commercial</b> <b>Bank</b> does not have any Equity in Europe but
> <b>European</b> <b>Commercial</b> <b>Bank</b> does
>
> This highlights the words Commercial & Bank separately in addition to
> European Commercial Bank.
>
> However, the correct output expected should be:
> This Commercial Bank does not have any Equity in Europe but <b>European</b>
> <b>Commercial</b> <b>Bank</b> does
>
> Which only highlights *European Commercial Bank* due to the <-> operator in
> phraseto_tsquery.
>
> SELECT phraseto_tsquery('English','European Commercial Bank');
> returns 'european' <-> 'commerci' <-> 'bank' as expected indicating the
> problem is with ts_headline function.
I tested this against Postgres 11 and master (and you tested on PG 10
and 14) and I found the same behavior, plus I found something even
worse:
SELECT ts_headline('English',
'This Commercial Bank does not have any Equity in Europe but European Commercial Bank does',
('''equiti'' <-> ''bank''')::tsquery);
ts_headline
----------------------------------------------------------------------------------------------------------------
This Commercial <b>Bank</b> does not have any <b>Equity</b> in Europe but European Commercial <b>Bank</b> does
Notice that "Bank" and "Equity" are not next to each other, but they
still highlight. In fact, the words appear to be independently checked:
SELECT ts_headline('English',
'This Commercial Bank does not have any Equity in Europe but European Commercial Bank does',
('''XXX'' <-> ''bank''')::tsquery);
ts_headline
---------------------------------------------------------------------------------------------------------
This Commercial <b>Bank</b> does not have any Equity in Europe but European Commercial <b>Bank</b> does
Is this documented somewhere?
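One way to cross-check the observation is to ask the @@ match operator
whether each query matches the document as a phrase at all, and compare
that with what ts_headline highlights. The following is only a sketch,
not part of the original report: the doc and q aliases and the
phrase_matches column name are made up for illustration, and the text
and queries are the ones used above. I would expect phrase_matches to
be false for the first two queries and true for the third, although
ts_headline highlights words for all three.

```sql
-- Sketch: check phrase matching with @@ for the queries shown above.
-- "doc", "q" and "phrase_matches" are illustrative names only.
WITH doc(body) AS (
    VALUES ('This Commercial Bank does not have any Equity in Europe but European Commercial Bank does')
)
SELECT q.query,
       -- @@ honors the <-> (FOLLOWED BY) operator and lexeme positions
       to_tsvector('english', doc.body) @@ q.query AS phrase_matches
FROM doc,
     (VALUES ('''equiti'' <-> ''bank'''::tsquery),
             ('''XXX'' <-> ''bank'''::tsquery),
             (phraseto_tsquery('english', 'European Commercial Bank'))
     ) AS q(query);
```

If the results come out that way, the phrase operator is being honored
by matching but not by the headline word selection.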
--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.