Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject:	Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
Date:	2022-08-11 21:12:46
Message-ID:	20220811211246.fyalckv3y6tizfwj@awork3.anarazel.de
Lists:	pgsql-hackers

Hi,

On 2022-08-10 14:52:36 +0530, Amit Kapila wrote:
> I think this could be the probable reason for failure though I didn't
> try to debug/reproduce this yet. AFAIU, this is possible during
> recovery/replay of WAL record XLOG_HASH_SPLIT_ALLOCATE_PAGE as via
> XLogReadBufferForRedoExtended, we can mark the buffer dirty while
> restoring from full page image. OTOH, because during normal operation
> we didn't mark the page dirty SyncOneBuffer would have skipped it due
> to check (if (!(buf_state & BM_VALID) || !(buf_state & BM_DIRTY))).

I think there might still be short-lived references from other paths, even if
the buffer is not marked dirty, but it isn't really important.
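
For reference, the check quoted above can be sketched in isolation. This is
only an illustration: the flag values and the helper name sync_would_skip()
are placeholders, not PostgreSQL's definitions. It shows that a valid but
clean buffer is skipped, while a buffer dirtied during replay (for example by
restoring a full page image) is not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for illustration; not PostgreSQL's actual values. */
#define BM_DIRTY (1U << 0)
#define BM_VALID (1U << 1)

/*
 * Hypothetical helper mirroring the quoted condition: the sync pass skips
 * a buffer unless it is both valid and dirty.
 */
static bool
sync_would_skip(uint32_t buf_state)
{
    return !(buf_state & BM_VALID) || !(buf_state & BM_DIRTY);
}
```

Under the quoted reasoning, a buffer that is not skipped can be referenced
briefly by the process writing it out, which is one way a concurrent pin
could appear during recovery.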

> > I assume this is trying to defend against some sort of deadlock by not
> > actually getting a cleanup lock (by passing get_cleanup_lock = true to
> > XLogReadBufferForRedoExtended()).
> >
>
> IIRC, this is just following what we do during normal operation and
> based on the theory that the meta-page is not updated yet so no
> backend will access it. I think we can do what you wrote unless there
> is some other reason behind this failure.

Well, it's not really the same if you silently continue in normal operation
and PANIC during recovery... If it's an optional operation the tiny race
around not getting the cleanup lock is fine, but it's a totally different
story during recovery.
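
A toy model of that asymmetry, as a sketch rather than the actual hash index
code: all type and function names below are hypothetical, and the only
assumption carried over is that a cleanup lock can be obtained only when the
caller holds the sole pin on the buffer.

```c
#include <stdbool.h>

/* Possible results of attempting the split in this toy model. */
typedef enum
{
    OUTCOME_PROCEED,    /* cleanup lock obtained, split goes ahead */
    OUTCOME_SKIP,       /* normal operation: give up quietly, retry later */
    OUTCOME_PANIC       /* recovery: replay cannot continue */
} Outcome;

/* Toy buffer: tracks only how many pins exist, including our own. */
typedef struct
{
    int pin_count;
} ToyBuffer;

/* A cleanup lock is available only if ours is the sole pin. */
static bool
toy_cleanup_lock_ok(const ToyBuffer *buf)
{
    return buf->pin_count == 1;
}

/* Normal operation: the split is optional, so a lost race is harmless. */
static Outcome
toy_split_normal(const ToyBuffer *buf)
{
    return toy_cleanup_lock_ok(buf) ? OUTCOME_PROCEED : OUTCOME_SKIP;
}

/* Replay: the record must be applied, so the same lost race is fatal. */
static Outcome
toy_split_replay(const ToyBuffer *buf)
{
    return toy_cleanup_lock_ok(buf) ? OUTCOME_PROCEED : OUTCOME_PANIC;
}
```

With a second, short-lived pin present (pin_count == 2), toy_split_normal()
yields OUTCOME_SKIP while toy_split_replay() yields OUTCOME_PANIC: the same
race, with very different consequences.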

Greetings,

Andres Freund
