Re: regression, deadlock in high frequency single-row UPDATE

From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andrew Sackville-West <awest(at)janrain(dot)com>, pgsql-bugs(at)postgresql(dot)org, Paulo Tanimoto <paulo(at)janrain(dot)com>
Subject: Re: regression, deadlock in high frequency single-row UPDATE
Date: 2014-12-10 21:56:39
Message-ID: 5488C197.3060900@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 11/12/14 08:33, Alvaro Herrera wrote:
> Mark Kirkwood wrote:
>
>> Not so much advantages - just seeing if I could still reproduce the issue
>> with a simpler test case i.e ensuring you were not doing anything 'odd' -
>> and you are not :-)
>>
>> The next step would be trying to figure out what commit introduced this
>> behaviour - but depesz has already beaten me to that (nice work)!
>
> So I traced through the problem using the simplified test case I posted.
> It goes like this:
>
> * tuple on the referenced table has been updated a bunch of times
> already, and keeps being updated, so there's a large update chain to
> follow.
>
> * two or more transactions (T1 and T2, say) have snapshots in which
> some version of the tuple (not the last one) is visible.
>
> * transaction T3 has a ForKeyShare lock on the latest version of the
> tuple, which is grabbed because of the INSERT in the referencing
> table. Note that these locks are propagated upwards in an update
> chain, so even if T3 locked an old version, all future versions are
> also locked by T3.
>
> * each of T1 and T2 tries to update the version they have visible; this
> fails because it's not the last version, so they need to go through
> EvalPlanQual, which walks through the update chain.
>
> At this point, T1 passes through heap_lock_tuple, gets the HW tuple
> lock, marks itself as locker in the tuple's Xmax. T1 is scheduled out;
> time for T2 to run. T2 goes through heap_lock_tuple, grabs HW lock;
> examines Xmax, sees that T1 is the locker, goes to sleep until T1
> awakes. T1 is scheduled to run again, runs heap_update. The first
> thing it needs is to do LockTuple ... but T2 is holding that one, so T1
> goes to sleep until T2 awakes -- now they are both sleeping on each
> other.
>
> Kaboom.
>
> I'm not seeing a solution here. There is something wrong in having to
> release the HW lock then grab it again, in that EvalPlanQual dance. It
> sounds like we need to hold on to the tuple's HW lock until after
> heap_update has run, but it's not at all clear how this would work --
> the interface through EvalPlanQual is messy enough as it is without
> entertaining the idea that a "please keep HW lock until later" flag
> needs to be passed up and down all over the place.
>
> One simple idea that occurs to me is have some global state in heapam.c;
> when the EPQ code is invoked from ExecUpdate, we tell heapam to keep the
> HW lock, and it's only released once heap_update is finished. That
> would keep T2 away from the tuple. We'd need a PG_TRY block in
> ExecUpdate to reset the global state once the update is done. (It's not
> entirely clear that we need all this for deletes too. The main
> difference is that after a delete completes the tuple is gone, which is
> obviously not the case in an update. Also, deletes don't create update
> chains.)
>
> One thing I haven't thought too much about is why doesn't this happen in
> 9.2.
>

Hmmm - interesting, yeah I think the why-it-doesn't-in-9.2 part is the
next thing to understand :-)

Cheers

Mark

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Mark Kirkwood 2014-12-11 07:12:07 Re: regression, deadlock in high frequency single-row UPDATE
Previous Message Jeff Frost 2014-12-10 21:38:13 Re: BUG #12187: Cant find the postgresql-client installation by yum install