From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Heikki <hlinnaka(at)iki(dot)fi> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: LWLock deadlock and gdb advice |
Date: | 2015-07-19 19:23:38 |
Message-ID: | CAMkU=1zUc=h0oCZntaJaqqW7gxxVxCWsYq8DD2t7oHgsgVEsgA@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Jul 16, 2015 at 12:03 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
> wrote:
>
>>
>> Both. Here's the patch.
>>
>> Previously, LWLockAcquireWithVar set the variable associated with the
>> lock atomically with acquiring it. Before the lwlock-scalability changes,
>> that was straightforward because you held the spinlock anyway, but it's a
>> lot harder/expensive now. So I changed the way acquiring a lock with a
>> variable works. There is now a separate flag, LW_FLAG_VAR_SET, which
>> indicates that the current lock holder has updated the variable. The
>> LWLockAcquireWithVar function is gone - you now just use LWLockAcquire(),
>> which always clears the LW_FLAG_VAR_SET flag, and you can call
>> LWLockUpdateVar() after that if you want to set the variable immediately.
>> LWLockWaitForVar() always waits if the flag is not set, i.e. it will not
>> return regardless of the variable's value, if the current lock-holder has
>> not updated it yet.
>>
>>
> I ran this for a while without casserts and it seems to work. But with
> casserts, I get failures in the autovac process on the GIN index.
>
> I don't see how this is related to the LWLock issue, but I didn't see it
> without your patch. Perhaps the system just didn't survive long enough to
> uncover it without the patch (although it shows up pretty quickly). It
> could just be an overzealous Assert, since the casserts off didn't show
> problems.
>
> bt and bt full are shown below.
>
> Cheers,
>
> Jeff
>
> #0 0x0000003dcb632625 in raise () from /lib64/libc.so.6
> #1 0x0000003dcb633e05 in abort () from /lib64/libc.so.6
> #2 0x0000000000930b7a in ExceptionalCondition (
> conditionName=0x9a1440 "!(((PageHeader) (page))->pd_special >=
> (__builtin_offsetof (PageHeaderData, pd_linp)))", errorType=0x9a12bc
> "FailedAssertion",
> fileName=0x9a12b0 "ginvacuum.c", lineNumber=713) at assert.c:54
> #3 0x00000000004947cf in ginvacuumcleanup (fcinfo=0x7fffee073a90) at
> ginvacuum.c:713
>
It now looks like this *is* unrelated to the LWLock issue. The assert that
it is tripping over was added only recently (302ac7f27197855afa8c), so I
had not been testing with it in place until now. It looks like it is
finding all-zero pages (index extended, but then a crash before the page
was initialized?) and it doesn't like them.
(gdb) f 3
(gdb) p *(char[8192]*)(page)
$11 = '\000' <repeats 8191 times>
Presumably before this assert, such pages would just be permanently
orphaned.
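
For illustration, here is a minimal standalone sketch of the byte-by-byte check that the gdb output above amounts to. It is not PostgreSQL code: the names SKETCH_BLCKSZ and sketch_page_is_all_zero are hypothetical, and the 8192-byte page size is assumed from the char[8192] cast. Inside the server, a never-initialized page is normally detected with PageIsNew() rather than by scanning the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size, taken from the char[8192] cast in the gdb session. */
#define SKETCH_BLCKSZ 8192

/*
 * Hypothetical helper: returns true if every byte of the page buffer is
 * zero, i.e. the page was allocated by extending the file but was never
 * initialized.
 */
static bool
sketch_page_is_all_zero(const char *page)
{
    size_t      i;

    assert(page != NULL);

    for (i = 0; i < SKETCH_BLCKSZ; i++)
    {
        if (page[i] != 0)
            return false;       /* found initialized data */
    }
    return true;
}
```

A check of this kind would return true for the page shown above. A page with a valid header would return false, because pd_special and the other header fields are non-zero once the page has been initialized.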
Cheers,
Jeff