From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
Date: 2022-08-17 18:36:23
Message-ID: 20220817183623.w3fsoerpaunt7exe@awork3.anarazel.de
Lists: pgsql-hackers
Hi,
On 2022-08-17 10:18:14 +0530, Amit Kapila wrote:
> > Looking at the non-recovery code makes me even more suspicious:
> >
> > /*
> > * Physically allocate the new bucket's primary page. We want to do this
> > * before changing the metapage's mapping info, in case we can't get the
> > * disk space. Ideally, we don't need to check for cleanup lock on new
> > * bucket as no other backend could find this bucket unless meta page is
> > * updated. However, it is good to be consistent with old bucket locking.
> > */
> > buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);
> > if (!IsBufferCleanupOK(buf_nblkno))
> > {
> > _hash_relbuf(rel, buf_oblkno);
> > _hash_relbuf(rel, buf_nblkno);
> > goto fail;
> > }
> >
> >
> > _hash_getnewbuf() calls _hash_pageinit() which calls PageInit(), which
> > memset(0)s the whole page. What does it even mean to check whether you
> > effectively have a cleanup lock after you zeroed out the page?
> >
> > Reading the README and the comment above makes me wonder if this whole cleanup
> > lock business here is just cargo culting and could be dropped?
> >
>
> I think it is okay to not acquire a cleanup lock on the new bucket
> page in both the recovery and non-recovery paths. It is primarily
> required on the old bucket page to avoid concurrent scans/inserts. As
> the comment mentions, and as far as my memory serves, it is mainly
> there to keep the locking consistent with the old bucket.
Zeroing out a page before even getting the cleanup lock is not "keeping it
consistent with old bucket locking", hopefully at least. This code is just
broken on multiple fronts, and consistency isn't a defense.
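FWIW, if the cleanup-lock check on the new bucket page really is cargo-culted,
the non-recovery path would reduce to something like the sketch below. This is
only an illustration of the argument, not a tested patch; it assumes
_hash_getnewbuf() keeps its current behavior of returning the buffer
exclusively locked and already initialized:

    /*
     * Physically allocate the new bucket's primary page.  Sketch only:
     * _hash_getnewbuf() returns the buffer exclusively locked and already
     * initialized via _hash_pageinit(), and no other backend can find this
     * bucket until the metapage mapping is updated, so an after-the-fact
     * IsBufferCleanupOK() check proves nothing -- the page has been zeroed
     * either way.  The check (and its goto fail path) could just be dropped.
     */
    buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);

The recovery-side check would presumably go away for the same reason, since
nothing can reach the new bucket before the metapage WAL record is replayed.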
Greetings,
Andres Freund