Re: BUG #17691: Unexpected behaviour using ts_headline()

From: sebastian(dot)patino-lang(at)posteo(dot)net
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #17691: Unexpected behaviour using ts_headline()
Date: 2022-11-20 12:03:39
Message-ID: 76717862-8936-41cc-80b0-f8bc0593a0c4@Spark
Lists: pgsql-bugs

Hi all,

Tom Lane made me aware that the link is not working and that it's better anyway to include all data in the bug report directly. Please find the file attached.

@Tom: thanks for the hint!

Regards,
Sebastian

On 19. Nov 2022, 14:05 +0100, PG Bug reporting form <noreply(at)postgresql(dot)org> wrote:
> The following bug has been logged on the website:
>
> Bug reference: 17691
> Logged by: Sebastian Patino-Lang
> Email address: sebastian(dot)patino-lang(at)posteo(dot)net
> PostgreSQL version: 13.9
> Operating system: x86_64-apple-darwin19.6.0
> Description:
>
> I experience unexpected behaviour when using ts_headline() in general, but
> especially when changing MaxFragments. Given the data in
> ts_headline_report.sql [1]
>
> SELECT id,
> ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
> world'), 'StartSel=<<, StopSel=>>') AS "preview"
> FROM texts;
>
> id=1: Highlight word is the first one in the result. Expectation: highlight
> word is somewhere in the middle.
> id=2: No highlight word at all.
> id=3: Highlight words are the first and last one in the result. Not ideal
> but ok-ish.
>
> SELECT id,
> ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
> world'), 'MaxFragments=1, StartSel=<<, StopSel=>>') AS "preview"
> FROM texts;
>
> id=1: Highlight word is now in the middle of the result. This is ok.
> id=2: No highlight word at all.
> id=3: Highlight words are in the middle part of the result. This is ok.
>
> SELECT id,
> ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
> world'), 'MaxFragments=2, StartSel=<<, StopSel=>>') AS "preview"
> FROM texts;
>
> id=1: Correct number of fragments (2) with highlight words are returned.
> id=2: No highlight word at all.
> id=3: Correct number of fragments (2) with highlight words are returned.
>
> SELECT id,
> ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
> world'), 'MaxFragments=3, StartSel=<<, StopSel=>>') AS "preview"
> FROM texts;
>
> id=1: Wrong number of fragments (2) with highlight words are returned.
> id=2: No highlight word at all.
> id=3: Correct number of fragments (3) with highlight words are returned.
>
> SELECT id,
> ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
> world'), 'MaxFragments=4, StartSel=<<, StopSel=>>') AS "preview"
> FROM texts;
>
> id=1: Wrong number of fragments (2) with highlight words are returned.
> id=2: No highlight word at all.
> id=3: Correct number of fragments (4) with highlight words are returned.
>
> ... and so on. Until MaxFragments=6 where for id=1 suddenly more fragments
> (4) get returned.
>
> SELECT id,
> ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
> world'), 'MaxFragments=6, StartSel=<<, StopSel=>>') AS "preview"
> FROM texts;
>
> id=1: Wrong number of fragments (4) with highlight words are returned.
> id=2: No highlight word at all.
> id=3: Correct number of fragments (6) with highlight words are returned.
>
> ... and so on. Until MaxFragments=11 where for id=1 the number of returned
> fragments changes again (5).
>
> SELECT id,
> ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
> world'), 'MaxFragments=11, StartSel=<<, StopSel=>>') AS "preview"
> FROM texts;
>
> id=1: Wrong number of fragments (5) with highlight words are returned.
> id=2: No highlight word at all.
> id=3: Correct number of fragments (11) with highlight words are returned.
>
> ... and so on. Until MaxFragments=14 where for id=2 suddenly more fragments
> (2) get returned.
>
> SELECT id,
> ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
> world'), 'MaxFragments=14, StartSel=<<, StopSel=>>') AS "preview"
> FROM texts;
>
> id=1: Wrong number of fragments (5) with highlight words are returned.
> id=2: Wrong number of fragments (2) with highlight words are returned.
> id=3: Correct number of fragments (11) with highlight words are returned.
>
> I stopped testing here, but I'm sure the strange behaviour and jumps in
> fragment count will continue.
>
> Any ideas?
>
> [1] https://1drv.ms/u/s!AqRGi9iGBWddgZVU_M2iuoRdTzM6tg?e=0OHHnB
>
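
For anyone reproducing the report, counting fragments by eye across many MaxFragments values is tedious. Below is a minimal sketch that tabulates the fragment and highlight counts per row in one query. It rests on assumptions that are not part of the original report: that the default FragmentDelimiter (' ... ') is in effect, and that neither ' ... ' nor '<<' occurs in the stored text itself. The aliases t, mf, h, fragments and highlights are illustrative only; the table and column names are the ones used in the queries above.

```sql
-- Tabulate fragment and highlight counts for MaxFragments = 0..14.
-- Assumption: default FragmentDelimiter ' ... ' and no literal ' ... '
-- or '<<' inside the documents, otherwise the counts are inflated.
SELECT t.id,
       mf,
       -- fragments are joined by the delimiter, so count the pieces
       array_length(string_to_array(h.preview, ' ... '), 1) AS fragments,
       -- every highlighted word contributes one '<<' (2 characters)
       (length(h.preview) - length(replace(h.preview, '<<', ''))) / 2 AS highlights
FROM texts AS t
CROSS JOIN generate_series(0, 14) AS mf
CROSS JOIN LATERAL (
    SELECT ts_headline('english',
                       t."fulltext",
                       to_tsquery('english', 'amazon & world'),
                       format('MaxFragments=%s, StartSel=<<, StopSel=>>', mf)
           ) AS preview
) AS h
ORDER BY t.id, mf;
```

With MaxFragments=0 ts_headline uses its non-fragment mode, so the fragments column is only meaningful for mf >= 1. A row whose highlights column is 0 corresponds to the "No highlight word at all" cases described above.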

Attachment Content-Type Size
ts_headline_report.sql application/octet-stream 760.8 KB