Re:Re:Re:Re: backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST

From: yuansong <yyuansong(at)126(dot)com>
To: "Peter Geoghegan" <pg(at)bowt(dot)ie>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re:Re:Re:Re: backup server core when redo btree_xlog_insert that type is XLOG_BTREE_INSERT_POST
Date: 2024-12-01 13:09:41
Message-ID: 39b023e9.1ed0.19382575335.Coremail.yyuansong@126.com
Lists: pgsql-bugs pgsql-hackers

The _bt_binsrch_insert function always returns low. But at the point where the posting list search is performed, are there cases where low and mid are unequal?

If so, this could potentially pass a wrong offset to the subsequent _bt_insertonpg call.

Maybe we could fix it like this?

OffsetNumber
_bt_binsrch_insert(Relation rel, BTInsertState insertstate)
{
    ......
    while (high > low)
    {
        OffsetNumber mid = low + ((high - low) / 2);

        /*
         * If tuple at offset located by binary search is a posting list whose
         * TID range overlaps with caller's scantid, perform posting list
         * binary search to set postingoff for caller. Caller must split the
         * posting list when postingoff is set. This should happen
         * infrequently.
         */
        if (unlikely(result == 0 && key->scantid != NULL))
        {
            /*
             * postingoff should never be set more than once per leaf page
             * binary search. That would mean that there are duplicate table
             * TIDs in the index, which is never okay. Check for that here.
             */
            if (insertstate->postingoff != 0)
                ereport(ERROR,
                        (errcode(ERRCODE_INDEX_CORRUPTED),
                         errmsg_internal("table tid from new index tuple (%u,%u) cannot find insert offset between offsets %u and %u of block %u in index \"%s\"",
                                         ItemPointerGetBlockNumber(key->scantid),
                                         ItemPointerGetOffsetNumber(key->scantid),
                                         low, stricthigh,
                                         BufferGetBlockNumber(insertstate->buf),
                                         RelationGetRelationName(rel))));

            insertstate->postingoff = _bt_binsrch_posting(key, page, mid);

            // Can low and mid ever be unequal here? If low is returned in
            // such a case, would that be an error? Maybe we fix it like this?
            // low = mid;
            // break;
        }
    }
    ........
    return low;
}
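
To reason about the question, here is a toy model of the loop. It is only a sketch under stated assumptions, not the nbtree code: the array cmp[] stands in for the results of _bt_compare, offsets are 0-based, cmpval is assumed to be 1, and the comparison results are assumed to be ordered with exactly one slot returning 0. The names SearchTrace and toy_binsrch_insert are hypothetical. Under those assumptions, low can differ from mid at the moment the posting list is hit, but the low that is finally returned still equals that mid.

```c
#include <assert.h>

/*
 * Toy model of the leaf-page binary search. cmp[i] stands in for the
 * comparison of the insertion key against slot i:
 *   > 0  key sorts after slot i
 *   == 0 scantid falls inside slot i's posting list TID range
 *   < 0  key sorts before slot i
 */
typedef struct SearchTrace
{
    int result;     /* value the loop returns (final low) */
    int hit_mid;    /* mid when result == 0 was seen, or -1 */
    int low_at_hit; /* low at that same moment, or -1 */
} SearchTrace;

static SearchTrace
toy_binsrch_insert(const int *cmp, int nslots)
{
    SearchTrace trace = {0, -1, -1};
    int low = 0;
    int high = nslots;
    const int cmpval = 1;   /* assumption: insertion search uses cmpval = 1 */

    while (high > low)
    {
        int mid = low + ((high - low) / 2);
        int result = cmp[mid];

        if (result >= cmpval)
            low = mid + 1;
        else
            high = mid;

        if (result == 0)
        {
            /* mirrors the "set at most once" check in the excerpt above */
            assert(trace.hit_mid == -1);
            trace.hit_mid = mid;
            trace.low_at_hit = low;
        }
    }

    trace.result = low;
    return trace;
}
```

With cmp = {1, 1, 1, 1, 0, -1, -1, -1}, the first probe lands on slot 4 while low is still 0, so low and mid are unequal at that moment. The loop then keeps narrowing the range and finishes with low at slot 4. If the comparison results were not ordered like this (for example on a corrupted page), the model says nothing about what would be returned.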

At 2024-11-27 18:53:20, "yuansong" <yyuansong(at)126(dot)com> wrote:

We have identified the cause of the crash: the XLOG_BTREE_INSERT_POST WAL record carried an OffsetNumber (offnum) that was one less than the offset stored in the index. I experimented with adding +1, and the index data remained normal in both cases. This issue is likely caused by concurrent operations on the B-tree: reviewing the corresponding WAL records, we found SPLIT_L and INSERT_LEAF operations on the same block before the crash. This might be a bug. I'm not sure whether there is a related fix.

At 2024-11-21 23:58:03, "Peter Geoghegan" <pg(at)bowt(dot)ie> wrote:
>On Thu, Nov 21, 2024 at 10:03 AM yuansong <yyuansong(at)126(dot)com> wrote:
>> Should nhtids be less than or equal to IndexTupleSize(oposting)?
>> Why is nhtids larger than IndexTupleSize(oposting) ? I think there should be an error in the master host writing the wal log.
>> Does anyone know when this will happen?
>
>It'll happen whenever there is a certain kind of data corruption.
>
>There were complaints about issues like this in the past. But those
>complaints seem to have gone away when more hardening was added to the
>code that runs during original execution (not the REDO routine code,
>which can only do what it is told to do by the WAL record).
>
>You're using PostgreSQL 13.2, which is a very old point release that
>lacks this hardening -- the current 13 point release is 13.18, so
>you're missing a lot. Had you been on a later point release you'd very
>probably have still had the issue with corruption (which could be from
>bad hardware), but you likely would have avoided the problem with the
>REDO routine crashing like this.
>
>--
>Peter Geoghegan
