Re: websearch_to_tsquery() returns queries that don't match to_tsvector()

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Valentin Gatien-Baron <valentin(dot)gatienbaron(at)gmail(dot)com>, Teodor Sigaev <teodor(at)postgrespro(dot)ru>, Oleg Bartunov <obartunov(at)postgrespro(dot)ru>
Subject: Re: websearch_to_tsquery() returns queries that don't match to_tsvector()
Date: 2021-05-02 17:45:18
Message-ID: CAPpHfdsKy5TzOTq5aV8tn+KQEd_C5mF0Sd_BrZ0e3+wGY5tLFw@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

Hi!

On Mon, Apr 19, 2021 at 9:57 AM Valentin Gatien-Baron
<valentin(dot)gatienbaron(at)gmail(dot)com> wrote:
> Looking at the tsvector and tsquery, we can see that the problem is
> that the ":" counts as one position for the ts_query but not the
> ts_vector:
>
> select to_tsvector('english', 'aaa: bbb'), websearch_to_tsquery('english', '"aaa: bbb"');
>    to_tsvector   | websearch_to_tsquery
> -----------------+----------------------
>  'aaa':1 'bbb':2 | 'aaa' <2> 'bbb'
> (1 row)

This looks like another bug in how phrase search interacts with query
parsing. I think that since 0c4f355c6a, websearch_to_tsquery() should
simply parse text in quotes as a single token. Besides fixing this bug,
that also simplifies the code.
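
To make the mismatch from the report concrete, here is a toy Python model
of phrase matching. It is only a sketch, not PostgreSQL's implementation:
the function name and the dict layout are made up for illustration, and it
assumes that 'a' <N> 'b' matches when 'b' occurs exactly N positions after
'a' (with <-> being the same as <1>).

```python
# Toy model of tsquery phrase matching, for illustration only.
# Assumption: 'a' <N> 'b' matches when some position of 'b' equals
# some position of 'a' plus exactly N. Not PostgreSQL's actual code.

def phrase_matches(tsvector, first, second, distance):
    """Return True if `second` occurs exactly `distance` positions after `first`."""
    first_positions = tsvector.get(first, [])
    second_positions = set(tsvector.get(second, []))
    return any(pos + distance in second_positions for pos in first_positions)

# Lexeme positions as shown for to_tsvector('english', 'aaa: bbb') in the report.
vector = {"aaa": [1], "bbb": [2]}

# Query from the report: 'aaa' <2> 'bbb' (the ":" was counted as a position).
print(phrase_matches(vector, "aaa", "bbb", 2))  # False

# Query with adjacent positions: 'aaa' <-> 'bbb', i.e. distance 1.
print(phrase_matches(vector, "aaa", "bbb", 1))  # True
```

In this model the query only matches once both sides agree on positions,
which is the effect of parsing the quoted text as a single token.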

Trying to fix this bug before 0c4f355c6a doesn't seem worth the effort.

I propose to push the attached patch to v14. Objections?

------
Regards,
Alexander Korotkov

Attachment: 0001-Make-websearch_to_tsquery-parse-text-in-quotes-as-a-.patch (application/octet-stream, 10.4 KB)
