All-zero page in GIN index causes assertion failure

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: All-zero page in GIN index causes assertion failure
Date: 2015-07-20 08:14:05
Message-ID: 55ACADCD.6020206@iki.fi
Lists: pgsql-hackers

This is a continuation of the discussion at
http://www.postgresql.org/message-id/CAMkU=1zUc=h0oCZntaJaqqW7gxxVxCWsYq8DD2t7oHgsgVEsgA@mail.gmail.com.
I'm starting a new thread because this is a separate issue from the
original LWLock bug.

> On Thu, Jul 16, 2015 at 12:03 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
>> On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
>> wrote:
>>
>> I don't see how this is related to the LWLock issue, but I didn't see it
>> without your patch. Perhaps the system just didn't survive long enough to
>> uncover it without the patch (although it shows up pretty quickly). It
>> could just be an overzealous Assert, since the casserts off didn't show
>> problems.
>
>> bt and bt full are shown below.
>>
>> Cheers,
>>
>> Jeff
>>
>> #0 0x0000003dcb632625 in raise () from /lib64/libc.so.6
>> #1 0x0000003dcb633e05 in abort () from /lib64/libc.so.6
>> #2 0x0000000000930b7a in ExceptionalCondition (
>> conditionName=0x9a1440 "!(((PageHeader) (page))->pd_special >=
>> (__builtin_offsetof (PageHeaderData, pd_linp)))", errorType=0x9a12bc
>> "FailedAssertion",
>> fileName=0x9a12b0 "ginvacuum.c", lineNumber=713) at assert.c:54
>> #3 0x00000000004947cf in ginvacuumcleanup (fcinfo=0x7fffee073a90) at
>> ginvacuum.c:713
>>
>
> It now looks like this *is* unrelated to the LWLock issue. The assert that
> it is tripping over was added just recently (302ac7f27197855afa8c) and so I
> had not been testing under its presence until now. It looks like it is
> finding all-zero pages (index extended but then a crash before initializing
> the page?) and it doesn't like them.
>
> (gdb) f 3
> (gdb) p *(char[8192]*)(page)
> $11 = '\000' <repeats 8191 times>
>
> Presumably before this assert, such pages would just be permanently
> orphaned.

Yeah, so it seems. It's normal to have all-zero pages in the index, if
you crash immediately after the relation has been extended, but before
the new page has been WAL-logged. What is your test case like; did you
do crash-testing?

ISTM ginvacuumcleanup should check for PageIsNew, and put the page in
the FSM. That's what btvacuumpage() and gistvacuumcleanup() do.
spgvacuumpage() seems to also check for PageIsNew(), but it seems broken
in a different way: it initializes the page and marks the page as dirty,
but it is not WAL-logged. That is a problem at least if checksums are
enabled: if you crash you might have a torn page on disk, with invalid
checksum.
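
A minimal, standalone sketch of the check proposed above, not the actual
ginvacuum.c code. It models PageIsNew() as "pd_upper == 0", which is how
the real macro tests for an uninitialized page. Everything prefixed with
Demo/demo_ is hypothetical and made up for illustration: the header
struct is a cut-down stand-in for PageHeaderData, and the "FSM" is just
an array of block numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */
#define DEMO_FSM_CAPACITY 16    /* hypothetical, for this sketch only */

/* Hypothetical, simplified stand-in for PageHeaderData. */
typedef struct
{
    uint16_t    pd_lower;       /* offset to start of free space */
    uint16_t    pd_upper;       /* offset to end of free space */
    uint16_t    pd_special;     /* offset to start of special space */
} DemoPageHeader;

/* A page is a BLCKSZ-byte block that begins with the header. */
typedef union
{
    DemoPageHeader hdr;
    unsigned char raw[BLCKSZ];
} DemoPage;

/* Hypothetical stand-in for the free space map: a list of free blocks. */
typedef struct
{
    uint32_t    free_blocks[DEMO_FSM_CAPACITY];
    size_t      nfree;
} DemoFsm;

/* Same test as PageIsNew(): an all-zero page has pd_upper == 0. */
static bool
demo_page_is_new(const DemoPage *page)
{
    return page->hdr.pd_upper == 0;
}

/*
 * Visit one page during vacuum cleanup. Returns true if the page was
 * new (all-zero); such a page is recorded as free, space permitting,
 * and the header sanity check is skipped. Only an initialized page is
 * checked for a valid pd_special.
 */
static bool
demo_vacuum_page(const DemoPage *page, uint32_t blkno, DemoFsm *fsm)
{
    if (demo_page_is_new(page))
    {
        if (fsm->nfree < DEMO_FSM_CAPACITY)
            fsm->free_blocks[fsm->nfree++] = blkno;
        return true;
    }

    /* The kind of check that tripped on the all-zero page. */
    assert(page->hdr.pd_special >= sizeof(DemoPageHeader));
    return false;
}
```

The point of the ordering is that the new-page test comes before anything
that interprets the header, so a zeroed page left behind by a crash is
recycled instead of hitting the assertion.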

- Heikki
