From: | Alexander Korotkov <akorotkov(at)postgresql(dot)org> |
---|---|
To: | pgsql-committers(at)lists(dot)postgresql(dot)org |
Subject: | pgsql: Make websearch_to_tsquery() parse text in quotes as a single tok |
Date: | 2021-05-03 01:19:44 |
Message-ID: | E1ldNFQ-00037b-IZ@gemulon.postgresql.org |
Lists: | pgsql-committers |
Make websearch_to_tsquery() parse text in quotes as a single token
websearch_to_tsquery() splits text in quotes into tokens and connects them with
the phrase operator on its own. However, that leads to surprising results when
a token contains no words.
For instance, websearch_to_tsquery('"aaa: bbb"') is 'aaa <2> bbb', because
it is equivalent to to_tsquery(E'aaa <-> \':\' <-> bbb'). But
websearch_to_tsquery('"aaa: bbb"') has to be 'aaa <-> bbb' in order to match
to_tsvector('aaa: bbb').
Since 0c4f355c6a, we connect lexemes of complex tokens with phrase operators
anyway. Thus, let's just make websearch_to_tsquery() parse text in quotes as
a single token. websearch_to_tsquery() will then process the quoted text in
the same way phraseto_tsquery() does. This solution is exactly what we need
and also simplifies the code.
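The behavior described above can be checked with a few queries. This is only
a sketch: the 'simple' text search configuration is an assumption made here to
keep stemming and stop words out of the picture, and it is not part of the
commit message. Per the description above, the first query is expected to give
'aaa <2> bbb' on servers without this change and 'aaa <-> bbb' with it.

```sql
-- Assumption: 'simple' configuration, chosen for illustration only.

-- The quoted phrase as websearch_to_tsquery() parses it.
SELECT websearch_to_tsquery('simple', '"aaa: bbb"');

-- The same text through phraseto_tsquery(), which the quoted part
-- is meant to behave like after this change.
SELECT phraseto_tsquery('simple', 'aaa: bbb');

-- The document side: lexeme positions that the query has to line up with.
SELECT to_tsvector('simple', 'aaa: bbb');

-- Whether the quoted phrase matches the identical document text.
SELECT to_tsvector('simple', 'aaa: bbb')
       @@ websearch_to_tsquery('simple', '"aaa: bbb"') AS matches;
```

Comparing the first two results on a given server shows whether quoted text
and phraseto_tsquery() agree there.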
This commit is an incompatible change, so we don't backpatch it.
Reported-by: Valentin Gatien-Baron
Discussion: https://postgr.es/m/CA%2B0DEqiZs7gdOd4ikmg%3D0UWG%2BSwWOLxPsk_JW-sx9WNOyrb0KQ%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Tom Lane, Zhihong Yu
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/eb086056fec44516efdd5db71244a079fed65c7f
Modified Files
--------------
src/backend/utils/adt/tsquery.c | 81 ++++++++++-------------------------
src/test/regress/expected/tsearch.out | 24 +++++++----
src/test/regress/sql/tsearch.sql | 1 +
3 files changed, 39 insertions(+), 67 deletions(-)