pgsql: Fix parsing of complex morphs to tsquery

From: Alexander Korotkov <akorotkov(at)postgresql(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix parsing of complex morphs to tsquery
Date: 2021-01-31 17:17:45
Message-ID: E1l6GM5-0007xS-Iu@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

Fix parsing of complex morphs to tsquery

When to_tsquery() or websearch_to_tsquery() meet a complex morph containing
multiple words residing adjacent position, these words are connected
with OP_AND operator. That leads to surprising results. For instace,
both websearch_to_tsquery('"pg_class pg"') and to_tsquery('pg_class <-> pg')
produce '( pg & class ) <-> pg' tsquery. This tsquery requires
'pg' and 'class' words to reside on the same position and doesn't match
to to_tsvector('pg_class pg'). It appears to be ridiculous behavior, which
needs to be fixed.

This commit makes to_tsquery() or websearch_to_tsquery() connect words
residing adjacent position with OP_PHRASE. Therefore, now those words are
normally chained with other OP_PHRASE operator. The examples of above now
produces 'pg <-> class <-> pg' tsquery, which matches to
to_tsvector('pg_class pg').

Another effect of this commit is that complex morph word positions now need to
match the tsvector even if there is no surrounding OP_PHRASE. This behavior
change generally looks like an improvement but making this commit not
backpatchable.

Reported-by: Barry Pederson
Bug: #16592
Discussion: https://postgr.es/m/16592-70b110ff9731c07d@postgresql.org
Discussion: https://postgr.es/m/CAPpHfdv0EzVhf6CWfB1_TTZqXV_2Sn-jSY3zSd7ePH%3D-%2B1V2DQ%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Tom Lane, Neil Chen

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/0c4f355c6a5fd437f71349f2f3d5d491382572b7

Modified Files
--------------
src/backend/tsearch/to_tsany.c | 41 ++++++++-
src/test/regress/expected/tsearch.out | 152 +++++++++++++++++-----------------
src/test/regress/sql/tsearch.sql | 36 ++++----
3 files changed, 132 insertions(+), 97 deletions(-)

Browse pgsql-committers by date

  From Date Subject
Next Message Peter Geoghegan 2021-01-31 18:11:27 pgsql: Remove unused _bt_delitems_delete() argument.
Previous Message Peter Eisentraut 2021-01-30 18:48:31 pgsql: Add primary keys and unique constraints to system catalogs