Re: BUG #16865: Regression: GIN Negated prefix search returns results that contain the search term

From: Dimitri Nüscheler <dimitri(dot)nuescheler(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Subject: Re: BUG #16865: Regression: GIN Negated prefix search returns results that contain the search term
Date: 2021-02-16 08:25:34
Message-ID: CABQ6qDywYdYeb2ong4Qvv75saijE2ZVgK-TErQkGuN==F8=7Dw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi Tom

Wow, this is going fast. I applied the second patch and it fixes my
original test case on my private data.
Thank you also for the explanations. I'm happy it's possible to get insight
into PostgreSQL interna so quickly :-)

Let me know if there's more I can do (testing particular cases or patch
variants etc).

Kind regards
Dimitri

Am Di., 16. Feb. 2021 um 01:27 Uhr schrieb Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> I wrote:
> > Anyway, this seems to put the final nail in the coffin of the idea
> > that it's sufficient for TS_execute to return a boolean. At least
> > the tsginidx.c callers really need to see the 3-way result. Hence
> > I propose the attached patch. It fixes the given test case, but
> > I wonder if you can try it and see if you see any remaining problems.
>
> After further reflection, it seems like now that we've done that,
> we should go all the way and replace the out-of-band recheck result
> from checkcondition_gin with a MAYBE result, as attached.
>
> AFAICT the previous patch didn't leave any user-visible bug,
> but there is an inefficiency in using the out-of-band signaling.
> Consider a query like
>
> (a & b:B) | c
>
> and suppose that some index entry has "b" and "c" but not "a".
> With the code as it stands, when checkcondition_gin checks for a
> match to "b:B" it will find one, and then because of the weight
> restriction it will return TS_MAYBE, and also set need_recheck.
> Then TS_execute_recurse will combine the TS_MAYBE with the TS_NO
> for "a" to produce TS_NO, and finally combine that with TS_YES
> for "c" to get TS_YES. This is a correct result: the row
> unquestionably matches the query. But since we set the
> need_recheck flag, we'll force a heap visit and recheck anyway.
> That's bogus, and we don't need to accept it anymore now that
> the data return path is fully ternary-clean. Returning TS_MAYBE
> is sufficient to do all the right things, so I propose the
> attached v2.
>
> regards, tom lane
>
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David G. Johnston 2021-02-16 14:10:58 Re: BUG #16868: Cannot find sqlstat error codes.
Previous Message Ivan Salvato 2021-02-16 07:07:21 Re: BUG #16866: pg_basebackup Windows Server 2016