From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | magicagent(at)gmail(dot)com |
Subject: | BUG #17556: ts_headline does not correctly find matches when separated by 4,999 words |
Date: | 2022-07-22 14:06:43 |
Message-ID: | 17556-70b0479170b83b81@postgresql.org |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 17556
Logged by: Alex Malek
Email address: magicagent(at)gmail(dot)com
PostgreSQL version: 14.4
Operating system: Red Hat
Description:
Correct results when 4,998 words separate search terms:
# select ts_headline('baz baz baz ipsum ' || repeat(' foo ',4998) || '
labor',
$$'ipsum' & 'labor'$$::tsquery, 'StartSel=>, StopSel=<,
MaxFragments=100, MaxWords=7, MinWords=3') ;
ts_headline
---------------------
>ipsum< ... >labor<
(1 row)
Adding one more word between the search terms, for a total of 4,999, causes
the terms not to be found:
# select ts_headline('baz baz baz ipsum ' || repeat(' foo ',4999) || '
labor',
$$'ipsum' & 'labor'$$::tsquery, 'StartSel=>, StopSel=<,
MaxFragments=100, MaxWords=7, MinWords=3') ;
ts_headline
-------------
baz baz baz
(1 row)
Works correctly if "&" (AND) is replaced by "|" (OR):
# select ts_headline('baz baz baz ipsum ' || repeat(' foo ',4999) || '
labor',
$$'ipsum' | 'labor'$$::tsquery, 'StartSel=>, StopSel=<,
MaxFragments=100, MaxWords=7, MinWords=3') ;
ts_headline
---------------------
>ipsum< ... >labor<
(1 row)
The "MinWords" argument and the number of words before the first search term
also alter the results.
Removing one word before the first search term makes ts_headline match the
first term only:
# select ts_headline('baz baz ipsum ' || repeat(' foo ',4999) || ' labor',
$$'ipsum' & 'labor'$$::tsquery, 'StartSel=>, StopSel=<,
MaxFragments=100, MaxWords=7, MinWords=3') ;
ts_headline
-----------------
baz baz >ipsum<
(1 row)
Reducing MinWords from 3 to 2 causes the terms once again not to be found:
# select ts_headline('baz baz ipsum ' || repeat(' foo ',4999) || ' labor',
$$'ipsum' & 'labor'$$::tsquery, 'StartSel=>, StopSel=<,
MaxFragments=100, MaxWords=7, MinWords=2') ;
ts_headline
-------------
baz baz
(1 row)
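The threshold reported above could also be checked in a single query by sweeping the number of filler words. The following is only a sketch: the range 4996..5000 and the column names n, headline and both_terms_marked are illustrative choices, not part of the original report, and the exact point at which the result changes may depend on the server version and text search configuration.

```sql
-- Sweep the number of ' foo ' filler words around the reported boundary.
-- The range 4996..5000 and the column aliases are illustrative assumptions.
SELECT n,
       headline,
       -- true when both search terms are wrapped in StartSel/StopSel
       (headline LIKE '%>ipsum<%' AND headline LIKE '%>labor<%') AS both_terms_marked
FROM (
    SELECT n,
           ts_headline('baz baz baz ipsum ' || repeat(' foo ', n) || ' labor',
                       $$'ipsum' & 'labor'$$::tsquery,
                       'StartSel=>, StopSel=<, MaxFragments=100, MaxWords=7, MinWords=3') AS headline
    FROM generate_series(4996, 5000) AS n
) AS sweep
ORDER BY n;
```

Based on the results above, both_terms_marked would be expected to change from true to false between n = 4998 and n = 4999 on 14.4.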