From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Pawel Kudzia <kudzia(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows |
Date: | 2021-07-11 16:14:21 |
Message-ID: | 69851f03-d44d-1d90-b174-836ad308e3c9@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On 11/07/2021 00:56, Alexander Korotkov wrote:
> I've closer look at shimTriConsistentFn() function. It looks to me
> like the function itself has inconsistencies. But the way it's
> currently used in GIN shouldn't produce the wrong query answers.
>
> shimTriConsistentFn() is one of implementation of
> GinScanKey.triConsistentFn. In turn, GinScanKey.triConsistentFn have
> 3 callers:
> 1) startScanKey() to determine required keys
> 2) keyGetItem() for lossy item check
> 3) keyGetItem() for normal item check
>
> Let's see shimTriConsistentFn() inconsistencies and how callers handle them.
> 1) shimTriConsistentFn() returns result of directBoolConsistentFn()
> for zero maybe entries without examining key->recheckCurItem. That
> may result in returning GIN_TRUE instead of GIN_MAYBE
> 1.1) startScanKey() doesn't care about recheck, just looking for
> GIN_FALSE result.
> 1.2) keyGetItem() for lossy item always implies recheck.
> 1.3) keyGetItem() for a normal item does the trick. Whether a current
> item is rechecked is controlled by key->recheckCurItem variable (the
> same which is set by directBoolConsistentFn(). The following switch
> sets key->recheckCurItem for GIN_MAYBE, but leaves it untouched for
> GIN_TRUE. Thus, GIN_TRUE with key->recheckCurItem works here just
> like GIN_MAYBE. I think this is inconsistent by itself, but this
> inconsistency compensates for inconsistency in shimTriConsistentFn().
> 2) shimTriConsistentFn() might call directBoolConsistentFn() with all
> false inputs for GIN_SEARCH_MODE_DEFAULT. The result is intended to
> be false, but opclass consistent method isn't intended to handle this
> situation correctly. For instance, ginint4_consistent() returns true
> without checking the input array. That could make
> shimTriConsistentFn() return GIN_TRUE instead of GIN_MAYBE, or
> GIN_MAYBE instead of GIN_FALSE.
> 2.1) In principle, this could lead startScanKey() to select less
> required entries than possible. But this is definitely not the case
> of ginint4_consistent() when meeting any of entries is enough for
> match.
> 2.2) In principle, keyGetItem() could get false positives for lossy
> items. But that wouldn't lead to wrong query answers. Again, this is
> not the case of ginint4_consistent().
> 2.3) keyGetItem() for a normal item doesn't call triConsistentFn()
> with any GIN_MAYBE or all GIN_FALSE.
>
> To sum up, I don't see how inconsistencies in shimTriConsistentFn()
> could lead to wrong query answers, especially in intarray GIN index.
Agreed, I came to the same conclusion looking at the code. Which means
that we still don't have an explanation for the original bug report :-(.
> Nevertheless, these inconsistencies should be fixed. I'm going to
> propose a patch soon.
Thanks! I came up with the attached patch. I changed
directBoolConsistentFn() to return a GinTernaryValue rather than bool, I
think that makes the logic in shimTriConsistentFn() more clear.
I also tried to write a test case to expose issue 2.1 above, but
couldn't come up with an example.
- Heikki
Attachment | Content-Type | Size |
---|---|---|
fix-tri-consistent-issues-1.patch | text/x-patch | 7.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2021-07-11 18:42:54 | Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows |
Previous Message | Tom Lane | 2021-07-11 15:28:58 | Re: BUG #17098: Assert failed on composing an error message when adding a type to an extension being dropped |