From: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Expose lock group leader pid in pg_stat_activity |
Date: | 2020-02-04 14:27:25 |
Message-ID: | 20200204142725.GA30965@nol |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Jan 30, 2020 at 10:03:01PM +0900, Michael Paquier wrote:
> On Tue, Jan 28, 2020 at 02:52:08PM +0100, Tomas Vondra wrote:
> > On Tue, Jan 28, 2020 at 02:26:34PM +0100, Julien Rouhaud wrote:
> >> There were already some dependencies between the rows since parallel
> >> queries were added, as you could see eg. a parallel worker while no
> >> query is currently active. This patch will make those corner cases
> >> more obvious.
>
> I was reviewing the code and one thing that I was wondering is if it
> would be better to make the code more defensive and return NULL when
> the PID of the referenced leader is 0 or InvalidPid. However that
> would mean that we have a dummy 2PC entry from PGPROC or something not
> yet started, which makes no sense. So your simpler version is
> actually fine. What you have here is that in the worst case you could
> finish with an incorrect reference to shared memory if a PGPROC is
> recycled between the moment you look for the leader field and the
> moment the PID value is fetched. That's very unlikely to happen, and
> I agree that it does not really justify the cost of taking extra locks
> every time we scan pg_stat_activity.
Ok.
>
> > Yeah, sure. I mean explicit dependencies, e.g. a column referencing
> > values from another row, like leader_id does.
>
> + The leader_pid is NULL for processes not involved in parallel query.
> This is missing two markups, one for "NULL" and a second for
> "leader_pid".
The extra "leader_pid" disappeared when I reworked the doc. I'm not sure what
you meant here for NULL as I don't see any extra markup used in closeby
documentation, so I hope this version is ok.
> The documentation does not match the surroundings
> either, so I would suggest a reformulation for the beginning:
> PID of the leader process if this process is involved in parallel query.
> And actually this paragraph is not completely true, because leader_pid
> remains set even after one parallel query run has been finished for a
> session when leader_pid = pid as lockGroupLeader is set to NULL only
> once the process is stopped in ProcKill().
Oh good point, that's unfortunately not a super friendly behavior. I tried to
adapt the documentation to address of all that. It's maybe slightly too
verbose, but I guess that extra clarity is welcome here.
> >> Should I document the possible inconsistencies?
> >
> > I think it's worth mentioning that as a comment in the code, say before
> > the pg_stat_get_activity function. IMO we don't need to document all
> > possible inconsistencies, a generic explanation is enough.
>
> Agreed that adding some information in the area when we look at the
> PGPROC entries for wait events and such would be nice.
I added some code comments to remind that we don't guarantee any consistency
here.
Attachment | Content-Type | Size |
---|---|---|
pgsa-leader-pid-v5.diff | text/plain | 10.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | 曾文旌 (义从) | 2020-02-04 15:01:37 | Re: [Proposal] Global temporary tables |
Previous Message | Rémi Lapeyre | 2020-02-04 13:25:03 | [PATCH v1] Allow COPY "text" format to output a header |