Re: LWLock deadlock and gdb advice

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki <hlinnaka(at)iki(dot)fi>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LWLock deadlock and gdb advice
Date: 2015-07-29 16:23:32
Message-ID: CAMkU=1ypjDjdEbq+sCQMHQVPKS9EWZDNr7jFaDSDo1ekix9djQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 28, 2015 at 9:06 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Tue, Jul 28, 2015 at 7:06 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
>> Hi,
>>
>> On 2015-07-19 11:49:14 -0700, Jeff Janes wrote:
>> > After applying this patch to commit fdf28853ae6a397497b79f, it has
>> > survived testing long enough to convince me that this fixes the problem.
>>
>> What was the actual workload breaking with the bug? I ran a small
>> variety and I couldn't reproduce it yet. I'm not saying there's no bug,
>> I just would like to be able to test my version of the fixes...
>>
>
> It was the torn-page fault-injection code here:
>
>
> https://drive.google.com/open?id=0Bzqrh1SO9FcEfkxFb05uQnJ2cWg0MEpmOXlhbFdyNEItNmpuek1zU2gySGF3Vk1oYXNNLUE
>
> It is not a minimal set; I don't know if all parts of this are necessary
> to reproduce it. The whole crash-recovery cycling might not even be
> important.
>

I've reproduced it again against commit b2ed8edeecd715c8a23ae462.

It took 5 hours on an 8-core "Intel(R) Xeon(R) CPU E5-2650".

I also reproduced it in 3 hours on the same machine with both JJ_torn_page
and JJ_xid set to zero (i.e. turned off, no induced crashes), so the
fault-injection patch should not be necessary to get the issue.

Cheers,

Jeff
