From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager |
Date: | 2020-03-10 10:41:34 |
Message-ID: | CAA4eK1+E8Vu=9PYZBZvMrga0Ynz_m6jmT3G_vJv-3L1PWv9Krg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Feb 24, 2020 at 3:38 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Feb 20, 2020 at 8:06 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > What I'm advocating is that extension locks should continue to go
> > through lock.c. And yes, that requires some changes to group locking,
> > but I still don't see why they'd be complicated.
> >
>
> Fair position, as per initial analysis, I think if we do below three
> things, it should work out without changing to a new way of locking
> for relation extension or page type locks.
> a. As per the discussion above, ensure in code we will never try to
> acquire another heavy-weight lock after acquiring relation extension
> or page type locks (probably by having Asserts in code or maybe some
> other way).
I have done an analysis of the relation extension lock (which can be
acquired via LockRelationForExtension or
ConditionalLockRelationForExtension) and found that we don't acquire
any other heavyweight lock after acquiring it. However, we do
sometimes try to acquire it again in the places where we update FSM
after extension, see points (e) and (f) described below. The usage of
this lock can be broadly divided into six categories and each one is
explained as follows:
a. Where after taking the relation extension lock we call ReadBuffer
(or its variant) and then LockBuffer. The LockBuffer internally calls
either LWLock to acquire or release neither of which acquire another
heavy-weight lock. It is quite obvious as well that while taking some
lightweight lock, there is no reason to acquire another heavyweight
lock on any object. The specs/comments of ReadBufferExtended (which
gets called from variants of ReadBuffer) API says that if the blknum
requested is P_NEW, only one backend can call it at-a-time which
indicates that we don't need to acquire any heavy-weight lock inside
this API. Otherwise, also, this API won't need a heavy-weight lock to
read the existing block into shared buffer as two different backends
are allowed to read the same block. I have also gone through all the
functions called/used in this path to ensure that we don't use
heavy-weight locks inside it.
The usage by APIs BloomNewBuffer, GinNewBuffer, gistNewBuffer,
_bt_getbuf, and SpGistNewBuffer falls in this category. Another API
that falls under this category is revmap_physical_extend which uses
ReadBuffer, LocakBuffer and ReleaseBuffer. The ReleaseBuffer API
unpins aka decrement the reference count for buffer and disassociates
a buffer from the resource owner. None of that requires heavy-weight
lock. T
b. After taking relation extension lock, we call
RelationGetNumberOfBlocks which primarily calls file-level functions
to determine the size of the file. This doesn't acquire any other
heavy-weight lock after relation extension lock.
The usage by APIs ginvacuumcleanup, gistvacuumscan, btvacuumscan, and
spgvacuumscan falls in this category.
c. There is a usage in API brin_page_cleanup() where we just acquire
and release the relation extension lock to avoid reinitializing the
page. As there is no call in-between acquire and release, so there is
no chance of another heavy-weight lock acquire after having relation
extension lock.
d. In fsm_extend() and vm_extend(), after acquiring relation extension
lock, we perform various file-level operations like RelationOpenSmgr,
smgrexists, smgrcreate, smgrnblocks, smgrextend. First, from theory,
we don't have any heavy-weight lock other than relation extension lock
which can cover such operations and then I have verified it by going
through these APIs that these don't acquire any other heavy-weight
lock. Then these APIs also call PageSetChecksumInplace computes a
checksum of the page and sets the same in page header which is quite
straight-forward and doesn't acquire any heavy-weight lock.
In vm_extend, we additionally call CacheInvalidateSmgr to send a
shared-inval message to force other backends to close any smgr
references they may have for the relation for which we extending
visibility map which has no reason to acquire any heavy-weight lock.
I have checked the code path as well and I didn't find any
heavy-weight lock call in that.
e. In brin_getinsertbuffer, we call ReadBuffer() and LockBuffer(), the
usage of which is the same as what is mentioned in (a). In addition
to that it calls brin_initialize_empty_new_buffer() which further
calls RecordPageWithFreeSpace which can again acquire relation
extension lock for same relation. This usage is safe because we have
a mechanism in heavy-weight lock manager that if we already hold a
lock and a request came for the same lock and in same mode, the lock
will be granted.
f. In RelationGetBufferForTuple(), there are multiple APIs that get
called and like (e), it can try to reacquire the relation extension
lock in one of those APIs. The main APIs it calls after acquiring
relation extension lock are described as follows:
- GetPageWithFreeSpace: This tries to find a page in the given
relation with at least the specified amount of free space. This
mainly checks the FSM pages and in one of the paths might call
fsm_extend which can again try to acquire the relation extension lock
on the same relation.
- RelationAddExtraBlocks: This adds multiple pages in a relation if
there is contention around relation extension lock. This calls
RelationExtensionLockWaiterCount which is mainly to check how many
lockers are waiting for the same lock, then call ReadBufferBI which as
explained above won't require heavy-weight locks and FSM APIs which
can acquire Relation extension lock on the same relation, but that is
safe as discussed previously.
The Page locks can be acquired via LockPage and ConditionalLockPage.
This is acquired from one place in the code during Gin index cleanup
(ginInsertCleanup). The basic idea is that it will scan the pending
list and move entries into the main index. While moving entries to
the main page, it might need to add a new page that will require us to
take a relation extension lock. Now, unlike relation extension lock,
after acquiring page lock, we do acquire another heavy-weight lock
(relation extension lock), but as we never acquire it in reverse
order, this is safe.
So, as per this analysis, we can add Asserts for relation extension
and page locks which will indicate that they won't participate in
deadlocks. It would be good if someone else can also do independent
analysis and verify my findings.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Hugh McMaster | 2020-03-10 10:53:14 | [PATCH] Use PKG_CHECK_MODULES to detect the libxml2 library |
Previous Message | Alexander Korotkov | 2020-03-10 10:05:40 | Re: [Patch] pg_rewind: options to use restore_command from recovery.conf or command line |