From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pawel Kudzia <kudzia(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows |
Date: | 2021-06-17 21:55:56 |
Message-ID: | CAPpHfdt5seb3UC7M8nGpNURSV=LqqBWaSs3To3YV_pxUXjA8Qg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Thu, Jun 17, 2021 at 10:49 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
> > Pawel Kudzia <kudzia(at)gmail(dot)com> writes:
> >> with help from IRC we've found that decreasing work_mem from 1MB to 256kB
> >> or less makes the problem go away:
>
> > Hmm. So that suggests that the index itself is *not* corrupt,
> > but the problem is associated with a bug in the indexscan
> > algorithms.
>
> After staring at the code I think there is at least one bug, and
> possibly two, in shimTriConsistentFn. That's likely to be implicated
> here because intarray's GIN opclass only provides a bool consistent
> function. I'm not very clear on the circumstances that lead to the scan
> code inventing GIN_MAYBE inputs, so I haven't been able to construct a
> test case.
>
> The definite bug is triggered because intarray relies on the API
> specification that says that if the search mode is
> GIN_SEARCH_MODE_DEFAULT, then the consistentFn will only be called
> when there's at least one TRUE key:
>
> case RTOverlapStrategyNumber:
> /* result is not lossy */
> *recheck = false;
> /* at least one element in check[] is true, so result = true */
> res = true;
> break;
>
> shimTriConsistentFn ignores this and calls it on all-FALSE inputs,
> for which it'll incorrectly get a TRUE result, as it will also for
> all the following checks. The upshot is that shimTriConsistentFn
> will convert any case with a MAYBE input to a hard TRUE result,
> with no recheck requirement. This'd easily explain the reported
> misbehavior. So in the spot where we do this:
>
> /* First call consistent function with all the maybe-inputs set FALSE */
> for (i = 0; i < nmaybe; i++)
> key->entryRes[maybeEntries[i]] = GIN_FALSE;
> curResult = directBoolConsistentFn(key);
>
> we need to add some code that checks for default searchMode, and in
> that case doesn't call the consistentFn unless at least one
> (non-MAYBE) input is TRUE.
+1,
I think in default search mode we can just start with curResult equal
to GIN_FALSE instead of calling directBoolConsistentFn().
> The other thing that I'm unsure about, because the code is horribly
> underdocumented, is that it's not very clear whether
> shimTriConsistentFn is accurately converting between the bool and
> the tristate APIs. That's because it's not very clear what the
> tristate API actually *is*. In particular, is the end state of
> key->recheckCurItem meaningful in the tristate case? If it's not,
> then the short-circuit case for no MAYBE inputs is broken, because
> it'll return TRUE when the bool consistentFn is trying to tell us
> to recheck. But if it is meaningful, then the multiway case is broken,
> because it will return the recheckCurItem result set by the last call to
> the bool consistentfn; which might be false even though other calls said
> true. (So in that case I think we'd need "key->recheckCurItem = recheck"
> at the end.) I also wonder how any of that works correctly for real
> triconsistent functions, which don't have access to the recheckCurItem
> flag.
As far as I recall, returning MAYBE for no MAYBE inputs in the
triConsistent function is equivalent to setting the recheck flag in
the bool consistent function. Thus, short-circuit case for no MAYBE
inputs should be broken.
> Anyway, I'm punting this to the code authors. I'd like to see
> some comments clarifying what the API really is, not just a
> quick-n-dirty code fix.
Yep, I'm going to have a closer look at this tomorrow.
------
Regards,
Alexander Korotkov
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2021-06-17 22:04:08 | Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows |
Previous Message | Tom Lane | 2021-06-17 21:51:18 | Re: BUG #17062: Assert failed in RemoveRoleFromObjectPolicy() on DROP OWNED policy applied to duplicate role |