From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Nasty btree deletion bug |
Date: | 2006-10-26 16:31:50 |
Message-ID: | 21633.1161880310@sss.pgh.pa.us |
Lists: | pgsql-hackers |

I wrote:
> [ looks at that for a bit... ] Yeah, you're right. Once the deletion
> is completed, the F lower-bound key will disappear from the grandparent,
> which would restore consistency --- but we could have already delivered
> wrong search answers, so that won't do.

On further reflection, I think I understand why we've not realized the
existence of this bug before: in fact, it *doesn't* lead to wrong search
answers. I think the only visible consequence is exactly the "failed to
re-find parent key" VACUUM error that Ed saw. The reason is that the
key misordering in the grandparent level is nearly harmless. Using your
example of

    - F D D ...

* if we happen to come across the F key first during a binary search of
the grandparent page, and we are looking for something <= F, we will
descend to its left, which is at worst a little bit inefficient:
_bt_moveright will still ensure that we find what we seek.
* if we happen to visit one of the D key(s) first, and we are looking
for something > D, we will descend to the right of that key. Well,
that's not incorrect for the live data. In fact, the *only* key in the
tree that we will fail to find this way is the F bounding key for the
half-dead page itself (or one of its also-deletable parents). So that's
exactly why VACUUM can fail while trying to clean up the half-dead page,
and it's why we're not seeing reports of wrong query answers.
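
The asymmetry in the two cases above (landing too far left is recoverable, landing too far right is not) can be illustrated with a toy model. This is only a sketch under stated assumptions: `Page`, `move_right`, and `search_from` are hypothetical names invented for illustration, the pages hold single-letter keys, and none of this reproduces the real `_bt_moveright` code or the on-disk page layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    """One page in a single tree level (hypothetical toy structure)."""
    keys: List[str]                  # live keys on this page, sorted
    high_key: Optional[str]          # upper bound of the page; None = rightmost
    right: Optional["Page"] = None   # right-sibling link


def move_right(page: Page, key: str) -> Page:
    """Follow right-links while `key` lies beyond the page's high key."""
    while (page.high_key is not None
           and key > page.high_key
           and page.right is not None):
        page = page.right
    return page


def search_from(start: Page, key: str) -> bool:
    """Search for `key`, beginning at whichever page the descent landed on."""
    return key in move_right(start, key).keys


# Three sibling pages, linked left to right (assumed example data).
p2 = Page(keys=["G", "H"], high_key=None)
p1 = Page(keys=["D", "E"], high_key="F", right=p2)
p0 = Page(keys=["A", "B"], high_key="C", right=p1)

print(search_from(p0, "E"))  # landed too far left: moving right recovers
print(search_from(p1, "E"))  # landed on the correct page
print(search_from(p2, "E"))  # landed too far right: the key is missed
```

In this toy model the first two searches succeed and the third fails, since there is no way to move left again. That mirrors the argument: a misordered key that sends a search too far left costs only extra right-link hops, while the one key that gets sent too far right is never found.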
So that reduces the priority of the bug quite a lot in my estimation;
and makes me not want to incur a lot of additional code and locking to
fix it. I'm wondering whether we can simply adopt a modified strategy
for searching for a half-dead page's parent during _bt_pagedel.

regards, tom lane