Re: Deadlock in multiple CIC.

From: Jeremy Finzel <finzelj(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deadlock in multiple CIC.
Date: 2017-12-27 16:50:19
Message-ID: CAMa1XUiXOw3mh7cJcN=pHHVbM1BcGx38z9imKCtPv5t1Xzg+QA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 26, 2017 at 11:23 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Tue, Dec 26, 2017 at 8:31 AM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
> wrote:
>
>> Jeff Janes wrote:
>> > c3d09b3bd23f5f6 fixed it so concurrent CIC would not deadlock (or at
>> least
>> > not as reliably as before) by dropping its own snapshot before waiting
>> for
>> > all the other ones to go away.
>> >
>> > With commit 8aa3e47510b969354ea02a, concurrent CREATE INDEX
>> CONCURRENTLY on
>> > different tables in the same database started deadlocking against each
>> > other again quite reliably.
>> >
>> > I think the solution is simply to drop the catalog snapshot as well, as
>> in
>> > the attached.
>>
>> Thanks for the analysis -- it sounds reasonable to me. However, I'm
>> wondering why you used the *Conditionally() version instead of plain
>> InvalidateCatalogSnapshot().
>
>
> My thinking was that if there was for some reason another snapshot hanging
> around, that dropping the catalog snapshot unconditionally would be a
> correctness bug, while doing it conditionally would just fail to avoid a
> theoretically avoidable deadlock. So it seemed safer.
>
>
>> I think they must have the same effect in
>> practice (the assumption being that you can't run CIC in a transaction
>> that may have other snapshots) but the theory seems simpler when calling
>> the other routine: just do away with the snapshot always, period.
>>
>
> That is probably true. But I never even knew that catalog snapshots
> existed until yesterday, so didn't want to make make assumptions about what
> else might exist, to avoid introducing new bugs similar to the one that
> 8aa3e47510b969354ea02a fixed.
>
>
>>
>> This is back-patchable to 9.4, first branch which has MVCC catalog
>> scans. It's strange that this has gone undetected for so long.
>
>
> Since the purpose of CIC is to build an index with minimal impact on other
> users, I think wanting to use it in concurrent cases might be rather rare.
> In a maintenance window, I wouldn't want to use CIC because it is slower
> and I'd rather just hold the stronger lock and do it fast, and as a hot-fix
> outside a maintenance window I usually wouldn't want to hog the CPU with
> concurrent builds when I could do them sequentially instead. Also, since
> deadlocks are "expected" errors rather than "should never happen" errors,
> and since the docs don't promise that you can do parallel CIC without
> deadlocks, many people would probably shrug it off (which I initially did)
> rather than report it as a bug. I was looking into it as an enhancement
> rather than a bug until I discovered that it was already enhanced and then
> undone.
>
> Cheers,
>
> Jeff
>

I was able to get this compiled, and ran the test before on stock 9.6.6,
then on this patched version. I indeed reproduced it on 9.6.6, but on the
patched version, it indeed fixes my issue.

Let me know if I can be of further help.

Thanks,
Jeremy

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-12-27 17:31:05 Re: [HACKERS] generated columns
Previous Message Konstantin Knizhnik 2017-12-27 15:37:22 Re: AS OF queries