Re: Index corruption with CREATE INDEX CONCURRENTLY

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index corruption with CREATE INDEX CONCURRENTLY
Date: 2017-02-06 04:11:00
Message-ID: CAA4eK1KXTN=on==QTp9=d=evz-7iVyDshYOu9jEQqujPknXgzA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 6, 2017 at 8:35 AM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> wrote:
>
>
> On Mon, Feb 6, 2017 at 8:15 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> On Mon, Feb 6, 2017 at 8:01 AM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
>> wrote:
>> >
>> >
>> > On Mon, Feb 6, 2017 at 1:44 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> >>
>> >>
>> >>
>> >> > 2. In the second patch, we tried to recompute attribute lists if a
>> >> > relcache
>> >> > flush happens in between and index list is invalidated. We've seen
>> >> > problems
>> >> > with that, especially it getting into an infinite loop with
>> >> > CACHE_CLOBBER_ALWAYS. Not clear why it happens, but apparently
>> >> > relcache
>> >> > flushes keep happening.
>> >>
>> >> Well, yeah: the point of CLOBBER_CACHE_ALWAYS is to make a relcache
>> >> flush
>> >> happen whenever it possibly could.
>> >
>> >
>> > Ah, ok. That explains why the retry logic as originally proposed was
>> > causing
>> > infinite loop.
>> >
>> >>
>> >> The way to deal with that without
>> >> looping is to test whether the index set *actually* changed, not
>> >> whether
>> >> it just might have changed.
>> >
>> >
>> > I like this approach. I'll run the patch on a build with
>> > CACHE_CLOBBER_ALWAYS, but I'm pretty sure it will be ok. I also like the
>> > fact that retry logic is now limited to only when index set changes,
>> > which
>> > fits in the situation we're dealing with.
>> >
>>
>> Don't you think it also has the same problem as mentioned by me in
>> Alvaro's patch and you also agreed on it?
>
>
> No, I don't think so. There can't be any more relcache flushes occurring
> between the first RelationGetIndexAttrBitmap() and the subsequent
> RelationGetIndexAttrBitmap() calls. The problem with the earlier proposed
> approach was that we were not caching, yet returning stale information. That
> implied the next call will again recompute a possibly different value. The
> retry logic is trying to close a small race yet important race condition
> where index set invalidation happens, but we cache stale information.
>

Hmm. Consider that the first time relcahe invalidation occurs while
computing id_attrs, so now the retry logic will compute the correct
set of attrs (considering two indexes, if we take the example given by
Alvaro above.). However, the attrs computed for hot_* and key_* will
be using only one index, so this will lead to a situation where two of
the attrs (hot_attrs and key_attrs) are computed with one index and
id_attrs is computed with two indexes. I think this can lead to
Assertion in HeapSatisfiesHOTandKeyUpdate().

>>
>> Do you see any need to fix
>> it locally in
>> RelationGetIndexAttrBitmap(), why can't we clear at transaction end?
>>
>
> We're trying to fix it in RelationGetIndexAttrBitmap() because that's where
> the race exists. I'm not saying there aren't other and better ways of
> handling it, but we don't know (I've studied Andres's proposal yet).
>

Okay, please find attached patch which is an extension of Tom's and
Andres's patch and I think it fixes the above problem noted by me.
Basically, invalidate the bitmaps at transaction end rather than in
RelationGetIndexAttrBitmap(). I think it is okay for
RelationGetIndexAttrBitmap() to use stale information until
transaction end because all the updates in the current running
transaction will be consumed by the second phase of CIC.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
invalidate_indexattr_v6.patch application/octet-stream 6.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-02-06 04:15:15 IF [NOT] EXISTS for replication slots
Previous Message Claudio Freire 2017-02-06 04:05:16 Re: ParallelFinish-hook of FDW/CSP (Re: Steps inside ExecEndGather)