From: | "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com> |
---|---|
To: | <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Add LWLock blocker(s) information |
Date: | 2020-06-07 06:12:59 |
Message-ID: | 1881eef5-df4f-bbfa-b03f-0a15d9fab55e@amazon.com |
Lists: | pgsql-hackers |
Hi hackers,
On 6/2/20 2:24 PM, Drouvot, Bertrand wrote:
>
> Hi hackers,
>
>
> I've attached a patch to add blocker(s) information for LW Locks.
> The motive behind is to be able to get some blocker(s) information (if
> any) in the context of LW Locks.
>
> _Motivation:_
>
> We have seen some cases with heavy contention on some LW Locks (large
> number of backends waiting on the same LW Lock).
>
> Adding some blocker information would make the investigations easier,
> it could help answering questions like:
>
> * how many PIDs are holding the LWLock (could be more than one in
> case of LW_SHARED)?
> * Is the blocking PID changing?
> * Is the number of blocking PIDs changing?
> * What is the blocking PID doing?
> * Is the blocking PID waiting?
> * Which mode is the blocked PID requesting?
> * In which mode is the blocker PID holding the lock?
>
> _Technical context and proposal:_
>
> There are two parts to this patch:
>
> * Add the instrumentation:
>
> * the patch adds into the LWLock struct:
>
> last_holding_pid: last pid owner of the lock
> last_mode: holding mode of the last pid owner of the lock
> nholders: number of holders (could be > 1 in case of LW_SHARED)
>
> * the patch adds into the PGPROC struct:
>
> lwLastHoldingPid: last holder of the LW lock the PID is waiting for
> lwHolderMode: LW lock mode of the last holder of the LW lock
> the PID is waiting for
> lwNbHolders: number of holders of the LW lock the PID is
> waiting for
>
> and what is necessary to update this new information.
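The new fields described above can be sketched as follows. This is a minimal, illustrative stand-in using plain C types, not the actual PostgreSQL struct layouts: the field names are taken from the mail, but LWLockMode, LWLockInfo, ProcBlockerInfo, and the record_acquire/record_wait helpers are simplified assumptions about what the patch's update logic might look like.

```c
#include <sys/types.h>

/* Simplified stand-in for PostgreSQL's LWLockMode enum */
typedef enum { LW_EXCLUSIVE, LW_SHARED, LW_WAIT_UNTIL_FREE } LWLockMode;

/* Fields the patch adds to the LWLock struct (illustrative layout) */
typedef struct LWLockInfo
{
    pid_t      last_holding_pid; /* last pid owner of the lock */
    LWLockMode last_mode;        /* holding mode of the last pid owner */
    int        nholders;         /* could be > 1 in case of LW_SHARED */
} LWLockInfo;

/* Fields the patch adds to the PGPROC struct (illustrative layout) */
typedef struct ProcBlockerInfo
{
    pid_t      lwLastHoldingPid; /* last holder of the awaited LW lock */
    LWLockMode lwHolderMode;     /* mode of that last holder */
    int        lwNbHolders;      /* number of holders of that lock */
} ProcBlockerInfo;

/* Hypothetical update performed when a backend acquires the lock */
static void
record_acquire(LWLockInfo *lock, pid_t pid, LWLockMode mode)
{
    lock->last_holding_pid = pid;
    lock->last_mode = mode;
    lock->nholders++;
}

/* Hypothetical snapshot taken when a backend starts waiting */
static void
record_wait(ProcBlockerInfo *proc, const LWLockInfo *lock)
{
    proc->lwLastHoldingPid = lock->last_holding_pid;
    proc->lwHolderMode = lock->last_mode;
    proc->lwNbHolders = lock->nholders;
}
```

Note that the snapshot in the PGPROC side is what pg_lwlock_blocking_pid would read back for a waiting backend, which is why only the last holder (not an exhaustive list) is available.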
>
> * Provide a way to display the information: the patch also adds a
> function /pg_lwlock_blocking_pid/ to display this new information.
>
> _Outcome Example:_
>
> # select * from pg_lwlock_blocking_pid(10259);
> requested_mode | last_holder_pid | last_holder_mode | nb_holders
> ----------------+-----------------+------------------+------------
> LW_EXCLUSIVE | 10232 | LW_EXCLUSIVE | 1
> (1 row)
>
> # select query,pid,state,wait_event,wait_event_type,pg_lwlock_blocking_pid(pid),pg_blocking_pids(pid) from pg_stat_activity where state='active' and pid != pg_backend_pid();
> query | pid | state | wait_event | wait_event_type | pg_lwlock_blocking_pid | pg_blocking_pids
> --------------------------------+-------+--------+---------------+-----------------+-------------------------------------------+------------------
> insert into bdtlwa values (1); | 10232 | active | | | (,,,) | {}
> insert into bdtlwb values (1); | 10254 | active | WALInsert | LWLock | (LW_WAIT_UNTIL_FREE,10232,LW_EXCLUSIVE,1) | {}
> create table bdtwt (a int); | 10256 | active | WALInsert | LWLock | (LW_WAIT_UNTIL_FREE,10232,LW_EXCLUSIVE,1) | {}
> insert into bdtlwa values (2); | 10259 | active | BufferContent | LWLock | (LW_EXCLUSIVE,10232,LW_EXCLUSIVE,1) | {}
> drop table bdtlwd; | 10261 | active | WALInsert | LWLock | (LW_WAIT_UNTIL_FREE,10232,LW_EXCLUSIVE,1) | {}
> (5 rows)
>
>
> So, should a PID be blocked on a LWLock, we could see:
>
> * which mode it is requesting
> * the last PID holding the lock
> * the mode of the last PID holding the lock
> * the number of PID(s) holding the lock
>
> _Remarks:_
>
> I did a few benchmarks so far and did not observe notable performance
> degradation (can share more details if needed).
>
> I did some quick attempts to get an exhaustive list of blockers (in
> case of LW_SHARED holders), but I think that would be challenging as:
>
> * There are about 40,000 calls to LWLockInitialize, and all my
> attempts to init a list here produced “FATAL: out of shared
> memory” or similar.
> * One way to get rid of using a list in LWLock could be to use
> proclist (with proclist_head in LWLock and proclist_node in
> PGPROC). This is the current implementation for the “waiters”
> list. But this would not work for the blockers, as one PGPROC can
> hold multiple LW locks, so it could mean having about 40K
> proclist_node entries per PGPROC.
> * I also have concerns about possible performance impact by using
> such a huge list in this context.
>
> Those are the reasons why this patch does not provide an exhaustive
> list of blockers.
>
> While this patch does not provide an exhaustive list of blockers (in
> case of LW_SHARED holders), the information it delivers could already
> be useful to get insights during LWLock contention scenarios.
>
> I will add this patch to the next commitfest. I look forward to your
> feedback about the idea and/or implementation.
>
> Regards,
>
> Bertrand
Attaching a new version of the patch with a tiny change to make it pass
the regression tests (opr_sanity was failing due to the new function
that is part of the patch).
Regards,
Bertrand
Attachment | Content-Type | Size |
---|---|---|
v1-0002-Add-LWLock-blockers-info.patch | text/plain | 11.6 KB |