Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date: 2020-02-12 16:53:49
Message-ID: 2707.1581526429@sss.pgh.pa.us
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> On Wed, Feb 12, 2020 at 7:36 AM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>> On Wed, 12 Feb 2020 at 00:43, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I would like to suggest that we do something similar to Robert Haas'
>>> excellent hack (daa7527af) for the !HAVE_SPINLOCK case in lmgr/spin.c,

>> My original proposal used LWLocks and hash tables for relation
>> extension but there was a discussion that using LWLocks is not good
>> because it's not interruptible[1].

> Hmm, but we use LWLocks for (a) WALWrite/Flush (see the usage of
> WALWriteLock), (b) writing the shared buffer contents (see
> io_in_progress lock and its usage in FlushBuffer) and perhaps for a few
> other similar things. Many times those take more time than extending a
> block in relation especially when we combine the WAL write for
> multiple commits. So, if this is a problem for relation extension
> lock, then the same thing holds true there also.

Yeah. I would say a couple more things:

* I see no reason to think that a relation extension lock would ever
be held long enough for noninterruptibility to be a real issue. Our
expectations for query cancel response time are in the tens to
hundreds of msec anyway.

* There are other places where an LWLock can be held for a *long* time,
notably the CheckpointLock. If we do think this is an issue, we could
devise a way to not insist on noninterruptibility. The easiest fix
is just to do a matching RESUME_INTERRUPTS after getting the lock and
HOLD_INTERRUPTS again before releasing it; though maybe it'd be worth
offering some slightly cleaner way. Point here is that LWLockAcquire
only does that because it's useful to the majority of callers, not
because it's graven in stone that it must be like that.
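
As a rough illustration of that "easiest fix", here is a stand-alone
model in C. It is only a sketch and not PostgreSQL source: every name
in it (holdoff_count, model_lock_acquire, acquire_interruptible, and so
on) is a hypothetical stand-in. It assumes only that LWLockAcquire
holds off interrupts by bumping a counter and that LWLockRelease undoes
that, which is what HOLD_INTERRUPTS and RESUME_INTERRUPTS do with
InterruptHoldoffCount.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names are PostgreSQL APIs. */
static int  holdoff_count = 0;   /* models InterruptHoldoffCount */
static bool lock_held = false;   /* models a single exclusive LWLock */

/* Model of HOLD_INTERRUPTS: one more reason to defer interrupts. */
static void hold_interrupts(void)
{
    holdoff_count++;
}

/* Model of RESUME_INTERRUPTS: must pair with an earlier hold. */
static void resume_interrupts(void)
{
    assert(holdoff_count > 0);
    holdoff_count--;
}

/* Interrupts may be serviced only when nothing is holding them off. */
static bool interrupts_allowed(void)
{
    return holdoff_count == 0;
}

/* Model of LWLockAcquire: hold off interrupts, then take the lock. */
static void model_lock_acquire(void)
{
    hold_interrupts();
    assert(!lock_held);
    lock_held = true;
}

/* Model of LWLockRelease: drop the lock, then undo the holdoff. */
static void model_lock_release(void)
{
    assert(lock_held);
    lock_held = false;
    resume_interrupts();
}

/* The suggested fix: a matching resume right after getting the lock... */
static void acquire_interruptible(void)
{
    model_lock_acquire();
    resume_interrupts();
}

/* ...and a hold again just before releasing, so the counts balance. */
static void release_interruptible(void)
{
    hold_interrupts();
    model_lock_release();
}
```

With the wrappers, interrupts_allowed() stays true while the lock is
held, and the counter is back at zero after release. A "slightly
cleaner way" would presumably hide this pairing behind the lock API
rather than leave it to each caller.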

In general, if we think there are issues with LWLock, it seems to me
we'd be better off to try to fix them, not to invent a whole new
single-purpose lock manager that we'll have to debug and maintain.
I do not see anything about this problem that suggests that that would
provide a major win. As Andres has noted, there are lots of other
aspects of it that are likely to be more useful to spend effort on.

regards, tom lane
