From: | Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, ashwinstar(at)gmail(dot)com |
Subject: | Re: A micro-optimisation for ProcSendSignal() |
Date: | 2021-07-24 05:26:17 |
Message-ID: | CAE-ML+__zDoUfwh9ntAhwMF=rJkPoK3iF=RY5Du05FSmHidBEw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
HI Thomas,
On Tue, Jul 20, 2021 at 10:40 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > Slightly tangential: we should add a comment to PGPROC.pgprocno, for more
> > immediate understandability:
> >
> > + int pgprocno; /* index of this PGPROC in ProcGlobal->allProcs */
>
> I wonder why we need this member anyway, when you can compute it from
> the address... #define GetPGProcNumber(p) ((p) - ProcGlobal->allProcs)
> or something like that? Kinda wonder why we don't use
> GetPGProcByNumber() in more places instead of open-coding access to
> ProcGlobal->allProcs, too...
I tried this out. See attached v4 of your patch with these changes.
> > Also, why don't we take the opportunity to get rid of SERIALIZABLEXACT->pid? We
> > took a stab. Attached is v2 of your patch with these changes.
>
> SERIALIZABLEXACT objects can live longer than the backends that
> created them. They hang around to sabotage other transactions' plans,
> depending on what else they overlapped with before they committed.
> With that change, the pg_locks view might show the pid of some
> unrelated session that moves into the same PGPROC.
I see.
>
> It's only an "informational" pid, and pids are imperfect information
> anyway because (1) they are themselves recycled, and (2) they won't be
> interesting in a hypothetical multi-threaded future. One solution
> would be to hide the pids from the view after the backend disconnects
> (somehow -- add a generation number?), but they're also still kinda
> useful, despite the weaknesses. I wonder what the ideal way would be
> to refer to sessions, anyway, including those that are no longer
> active; perhaps we could invent a new "session ID" concept.
Updating the pg_locks view:
Yes, the pids may be valuable for future debugging/audit purposes. Also,
systems where pid_max is high enough to not see wraparound, will have
pids that are not recycled. I would lean towards showing the pid even
after the backend has exited.
Perhaps we could overload the stored pid to be negated (i.e. a backend
with pid 20000 will become -20000) to indicate that the pid belongs to
a backend that has exited. Additionally, we could introduce a boolean
field in pg_locks "backendisalive", so that the end user doesn't have
to reason about negative pids.
Session ID:
Interesting, Greenplum uses the concept of session ID pervasively. Being
a distributed database, the session ID helps to tie individual backends
on each PG instance to the same session. The session ID of course comes
with its share of bookkeeping:
* These IDs are incrementally dished out with a counter
(with pg_atomic_add_fetch_u32), in increments of 1, on the Greenplum
coordinator PG instance in InitProcess.
* The counter is a part of ProcGlobal and itself is initialized to 0 in
InitProcGlobal, which means that session IDs are reset on cluster
restart.
* The sessionID tied to each proceess is maintained in PGPROC.
* The sessionID finds its way into PgBackendStatus, which is further
reported with pg_stat_get_activity.
A session ID seems a bit heavy just to indicate whether a backend has
exited.
Regards,
Soumyadeep
Attachment | Content-Type | Size |
---|---|---|
v4-0001-Optimize-ProcSendSignal.patch | text/x-patch | 19.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2021-07-24 05:39:53 | proposal: plpgsql: special comments that will be part of AST |
Previous Message | Robert Haas | 2021-07-24 01:02:29 | Re: Followup Timestamp to timestamp with TZ conversion |