Re: Deadlock in multiple CIC.

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeremy Finzel <finzelj(at)gmail(dot)com>
Subject: Re: Deadlock in multiple CIC.
Date: 2017-12-26 17:23:32
Message-ID: CAMkU=1wpDwfPkaG8bqqs2piv9N+9-uwvpjzYYkNPtr4RLA0WEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 26, 2017 at 8:31 AM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:

> Jeff Janes wrote:
> > c3d09b3bd23f5f6 fixed it so concurrent CIC would not deadlock (or at
> least
> > not as reliably as before) by dropping its own snapshot before waiting
> for
> > all the other ones to go away.
> >
> > With commit 8aa3e47510b969354ea02a, concurrent CREATE INDEX CONCURRENTLY
> on
> > different tables in the same database started deadlocking against each
> > other again quite reliably.
> >
> > I think the solution is simply to drop the catalog snapshot as well, as
> in
> > the attached.
>
> Thanks for the analysis -- it sounds reasonable to me. However, I'm
> wondering why you used the *Conditionally() version instead of plain
> InvalidateCatalogSnapshot().

My thinking was that if there was for some reason another snapshot hanging
around, that dropping the catalog snapshot unconditionally would be a
correctness bug, while doing it conditionally would just fail to avoid a
theoretically avoidable deadlock. So it seemed safer.

> I think they must have the same effect in
> practice (the assumption being that you can't run CIC in a transaction
> that may have other snapshots) but the theory seems simpler when calling
> the other routine: just do away with the snapshot always, period.
>

That is probably true. But I never even knew that catalog snapshots
existed until yesterday, so didn't want to make make assumptions about what
else might exist, to avoid introducing new bugs similar to the one that
8aa3e47510b969354ea02a fixed.

>
> This is back-patchable to 9.4, first branch which has MVCC catalog
> scans. It's strange that this has gone undetected for so long.

Since the purpose of CIC is to build an index with minimal impact on other
users, I think wanting to use it in concurrent cases might be rather rare.
In a maintenance window, I wouldn't want to use CIC because it is slower
and I'd rather just hold the stronger lock and do it fast, and as a hot-fix
outside a maintenance window I usually wouldn't want to hog the CPU with
concurrent builds when I could do them sequentially instead. Also, since
deadlocks are "expected" errors rather than "should never happen" errors,
and since the docs don't promise that you can do parallel CIC without
deadlocks, many people would probably shrug it off (which I initially did)
rather than report it as a bug. I was looking into it as an enhancement
rather than a bug until I discovered that it was already enhanced and then
undone.

Cheers,

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2017-12-26 17:42:56 Re: General purpose hashing func in pgbench
Previous Message Alvaro Herrera 2017-12-26 16:55:57 Re: [PROPOSAL] Shared Ispell dictionaries