From: | James Lucas <jlucasdba(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #16582: Logical index corruption leading to apparent index scan infinite loop |
Date: | 2020-08-17 16:21:42 |
Message-ID: | CAAFmbbOnCtds-Q5vOAmTMBm5sAvBpQhc474zq+LMCidSjgt11A@mail.gmail.com |
Lists: | pgsql-bugs |
Forgot to say, I don't think I can run bt_index_parent_check() right
now due to the broader locks required. I will try to get a run in if
I get an opportunity.
Thanks,
James
On Mon, Aug 17, 2020 at 10:51 AM James Lucas <jlucasdba(at)gmail(dot)com> wrote:
>
> Hi Peter,
>
> I re-ran with DEBUG2 messages enabled. Got a bunch of output, but the
> last few lines are like this for each index:
>
> DEBUG: level 965868789 leftmost page of index "xxxxx" was found
> deleted or half dead
> DETAIL: Deleted page found when building scankey from right sibling.
> DEBUG: level 966240004 leftmost page of index "xxxxx" was found
> deleted or half dead
> DETAIL: Deleted page found when building scankey from right sibling.
> ERROR: cross page item order invariant violated for index "xxxxx"
> DETAIL: Last item on page tid=(xx,xx) page lsn=xxxxxxxxxx
>
> DEBUG: level 967745369 leftmost page of index "xxxxx" was found
> deleted or half dead
> DETAIL: Deleted page found when building scankey from right sibling.
> DEBUG: level 967746918 leftmost page of index "xxxxx" was found
> deleted or half dead
> DETAIL: Deleted page found when building scankey from right sibling.
> ERROR: cross page item order invariant violated for index "xxxxx"
> DETAIL: Last item on page tid=(xx,xx) page lsn=xxxxxxxxxx
>
>
> Not sure if pageinspect might be able to tell anything else useful?
> I'd like to find the root cause of the corruption if possible, so this
> doesn't happen in other databases.
>
> Also wanted to see if it might be a good idea to add a
> CHECK_FOR_INTERRUPTS call to _bt_moveright() so if this does happen
> again, at least the session would be killable. I don't have enough
> background in the code to know where it's safe to add, or I'd submit a
> patch.
>
> Thanks,
> James
>
> On Fri, Aug 14, 2020 at 4:33 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> >
> > On Fri, Aug 14, 2020 at 2:03 PM PG Bug reporting form
> > <noreply(at)postgresql(dot)org> wrote:
> > > The table has two indexes, so I decided to scan both indexes on all
> > > partitions with the bt_index_check function from the amcheck extension. I
> > > identified one partition where both indexes throw the following result:
> > > ERROR: cross page item order invariant violated for index "xxxxx"
> > > DETAIL: Last item on page tid=(xx,xx) page lsn=xxxxxxxxxx
> >
> > This sounds very much like an index with sibling pages that are in the
> > wrong order relative to each other. That's totally consistent with
> > what you describe with _bt_moveright() -- circular sibling links can
> > cause it to just keep going.
> >
> > It's possible that you'll get a better error with
> > bt_index_parent_check(), which might be worth trying. But it probably
> > won't give you any additional information.
> >
> > Note that there is DEBUG1 and DEBUG2 output from amcheck, which might
> > give you a few more details. You can "set client_min_messages =
> > 'debug2'" in an interactive session that runs bt_index_check() to see
> > some additional context. Again, this is unlikely to make all that much
> > difference.
> >
> > --
> > Peter Geoghegan
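
Editorial illustration (not part of the original messages): Peter's point about circular sibling links can be pictured with a toy model. The sketch below is not PostgreSQL code and does not reflect how _bt_moveright() is implemented. It assumes a leaf level represented as a plain dictionary mapping each page number to its right sibling, and the names walk_right, healthy, and corrupt are hypothetical. It only shows why a walk that keeps following right links never ends once a link points backwards, and how remembering visited pages would turn that into an error.

```python
from typing import Dict, List, Optional


def walk_right(right_link: Dict[int, Optional[int]], start: int) -> List[int]:
    """Follow right-sibling links from `start` until the rightmost page.

    Toy model only: raises RuntimeError if a page is reached twice,
    which is what a circular sibling link would cause.
    """
    seen = set()
    order: List[int] = []
    page: Optional[int] = start
    while page is not None:
        if page in seen:
            # Without this check the loop would never terminate.
            raise RuntimeError(f"circular sibling link: page {page} reached twice")
        seen.add(page)
        order.append(page)
        page = right_link[page]
    return order


# Healthy level: 1 -> 2 -> 3 -> end of level
healthy = {1: 2, 2: 3, 3: None}

# Corrupt level (hypothetical): page 3 points back to page 2
corrupt = {1: 2, 2: 3, 3: 2}
```

Calling walk_right(healthy, 1) returns the pages in order, while walk_right(corrupt, 1) raises instead of spinning. A walk without the visited-set check would loop over pages 2 and 3 indefinitely, which is consistent with the unkillable session described in this thread.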