Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
Date: 2022-10-13 20:55:35
Message-ID: 20221013205535.cknud3qrq6woe2o7@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-10-06 12:44:24 +0530, Amit Kapila wrote:
> On Sat, Oct 1, 2022 at 12:35 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > My problem with this approach is that the whole cleanup lock is hugely
> > misleading as-is. As I noted in
> > https://www.postgresql.org/message-id/20220817193032.z35vdjhpzkgldrd3%40awork3.anarazel.de
> > we take the cleanup lock *after* re-initializing the page. Thereby
> > completely breaking the properties that a cleanup lock normally tries to
> > guarantee.
> >
> > Even if that were to achieve something useful (doubtful in this case),
> > it'd need a huge comment explaining what's going on.
> >
>
> Attached are two patches. The first patch is what Robert has proposed
> with some changes in comments to emphasize the fact that cleanup lock
> on the new bucket is just to be consistent with the old bucket page
> locking as we are initializing it just before checking for cleanup
> lock. In the second patch, I removed the acquisition of cleanup lock
> on the new bucket page and changed the comments/README accordingly.
>
> I think we can backpatch the first patch and the second patch can be
> just a HEAD-only patch. Does that sound reasonable to you?

Not particularly, no. I don't understand how "overwrite a page and then get a
cleanup lock" can sensibly be described by this comment:

> +++ b/src/backend/access/hash/hashpage.c
> @@ -807,7 +807,8 @@ restart_expand:
> * before changing the metapage's mapping info, in case we can't get the
> * disk space. Ideally, we don't need to check for cleanup lock on new
> * bucket as no other backend could find this bucket unless meta page is
> - * updated. However, it is good to be consistent with old bucket locking.
> + * updated and we initialize the page just before it. However, it is just
> + * to be consistent with old bucket locking.
> */
> buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);
> if (!IsBufferCleanupOK(buf_nblkno))

This is basically saying "I am breaking basic rules of locking just to be
consistent", no?

Greetings,

Andres Freund
