Re: _bt_split(), and the risk of OOM before its critical section

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: _bt_split(), and the risk of OOM before its critical section
Date: 2019-05-07 00:15:30
Message-ID: CAH2-Wzk667fF-Fa3HKu=qNV8ymjB3XTYnQNUEsGLgKA7aOWtjA@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 6, 2019 at 4:11 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> The important question is how VACUUM will recognize it. It's clearly
> not as bad as something that causes "failed to re-find parent key"
> errors, but I think that VACUUM might not be reclaiming it for the FSM
> (haven't checked). Note that _bt_unlink_halfdead_page() is perfectly
> happy to ignore the fact that the left sibling of a half-dead page has
> a rightlink that doesn't point back to the target. Because, uh, there
> might have been a concurrent page deletion, somehow.

VACUUM asserts P_FIRSTDATAKEY(opaque) > PageGetMaxOffsetNumber(page)
within _bt_mark_page_halfdead(), but doesn't test that condition in
release builds. This means that the earliest modifications of the
right page, before the high key PageAddItem(), are enough to cause a
subsequent "failed to re-find parent key" failure in VACUUM. Merely
setting the sibling block numbers in the right page's special area is
enough to cause VACUUM to refuse to run.
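
To make the assert-versus-release-build distinction concrete, here is a
minimal standalone sketch. It is not nbtree code: ToyPage and the toy_*
functions are hypothetical names invented for illustration, and the two
fields only model P_FIRSTDATAKEY(opaque) and PageGetMaxOffsetNumber(page).
The first function relies on an assertion alone, so the check vanishes
when assertions are compiled out; the second tests the same condition
unconditionally and reports the problem to its caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a leaf page; NOT PostgreSQL's page layout. */
typedef struct ToyPage
{
    uint16_t first_data_key;    /* models P_FIRSTDATAKEY(opaque) */
    uint16_t max_offset;        /* models PageGetMaxOffsetNumber(page) */
} ToyPage;

/* True when the page holds no data items, the condition VACUUM asserts. */
static bool
toy_page_has_no_data_items(const ToyPage *page)
{
    return page->first_data_key > page->max_offset;
}

/*
 * Assert-only variant: with assertions disabled (NDEBUG here, a
 * non-assert-enabled build in PostgreSQL) the check disappears and the
 * function proceeds regardless of the page's contents.
 */
static bool
toy_mark_halfdead_assert_only(const ToyPage *page)
{
    assert(toy_page_has_no_data_items(page));
    return true;
}

/*
 * Checked variant (an assumption about what a defensive version could
 * look like): the condition is tested in every build, and failure is
 * returned to the caller instead of being silently ignored.
 */
static bool
toy_mark_halfdead_checked(const ToyPage *page)
{
    if (!toy_page_has_no_data_items(page))
        return false;
    return true;
}
```

With a toy page of {first_data_key = 2, max_offset = 3}, the checked
variant returns false in any build, while the assert-only variant returns
true whenever assertions are compiled out.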

Of course, the problem goes away if you restart the database, because
the right page buffer is never marked dirty, and never can be. That
factor would probably make the problem appear to be an intermittent
issue in the kinds of environments where it is most likely to be seen.

--
Peter Geoghegan
