Re: Perform streaming logical transactions by background workers and parallel apply

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-16 21:42:57
Message-ID: CAHut+Psy9djSojhME3T_JjOQJsGiW-921Ruz7h2w1CTTMggg0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 16, 2023 at 5:41 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Jan 16, 2023 at 10:24 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > 2.
> >
> > /*
> > + * Return the pid of the leader apply worker if the given pid is the pid of a
> > + * parallel apply worker, otherwise return InvalidPid.
> > + */
> > +pid_t
> > +GetLeaderApplyWorkerPid(pid_t pid)
> > +{
> > + int leader_pid = InvalidPid;
> > + int i;
> > +
> > + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
> > +
> > + for (i = 0; i < max_logical_replication_workers; i++)
> > + {
> > + LogicalRepWorker *w = &LogicalRepCtx->workers[i];
> > +
> > + if (isParallelApplyWorker(w) && w->proc && pid == w->proc->pid)
> > + {
> > + leader_pid = w->leader_pid;
> > + break;
> > + }
> > + }
> > +
> > + LWLockRelease(LogicalRepWorkerLock);
> > +
> > + return leader_pid;
> > +}
> >
> > 2a.
> > IIUC the IsParallelApplyWorker macro does nothing except check that
> > the leader_pid is not InvalidPid anyway, so AFAIK this algorithm does
> > not benefit from using this macro because we will want to return
> > InvalidPid anyway if the given pid matches.
> >
> > So the inner condition can just say:
> >
> > if (w->proc && w->proc->pid == pid)
> > {
> > leader_pid = w->leader_pid;
> > break;
> > }
> >
>
> Yeah, this should also work but I feel the current one is explicit and
> more clear.

OK.

But, I have one last comment about this function -- I saw there are
already other functions that iterate max_logical_replication_workers
like this looking for things:
- logicalrep_worker_find
- logicalrep_workers_find
- logicalrep_worker_launch
- logicalrep_sync_worker_count

So I felt this new function (currently called GetLeaderApplyWorkerPid)
ought to be named similarly to those ones. e.g. call it something like
"logicalrep_worker_find_pa_leader_pid".

>
> > ~
> >
> > 2b.
> > A possible alternative comment.
> >
> > BEFORE
> > Return the pid of the leader apply worker if the given pid is the pid
> > of a parallel apply worker, otherwise return InvalidPid.
> >
> >
> > AFTER
> > If the given pid has a leader apply worker then return the leader pid,
> > otherwise, return InvalidPid.
> >
>
> I don't think that is an improvement.
>
> > ======
> >
> > src/backend/utils/adt/pgstatfuncs.c
> >
> > 3.
> >
> > @@ -434,6 +435,16 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
> > values[28] = Int32GetDatum(leader->pid);
> > nulls[28] = false;
> > }
> > + else
> > + {
> > + int leader_pid = GetLeaderApplyWorkerPid(beentry->st_procpid);
> > +
> > + if (leader_pid != InvalidPid)
> > + {
> > + values[28] = Int32GetDatum(leader_pid);
> > + nulls[28] = false;
> > + }
> > +
> >
> > 3a.
> > There is an existing comment preceding this if/else but it refers only
> > to leaders of parallel groups. Should that comment be updated to
> > mention the leader apply worker too?
> >
>
> Yeah, we can slightly adjust the comments. How about something like the below:
> index 415e711729..7eb668634a 100644
> --- a/src/backend/utils/adt/pgstatfuncs.c
> +++ b/src/backend/utils/adt/pgstatfuncs.c
> @@ -410,9 +410,9 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
>
> /*
> * If a PGPROC entry was retrieved, display
> wait events and lock
> - * group leader information if any. To avoid
> extra overhead, no
> - * extra lock is being held, so there is no guarantee of
> - * consistency across multiple rows.
> + * group leader or apply leader information if
> any. To avoid extra
> + * overhead, no extra lock is being held, so
> there is no guarantee
> + * of consistency across multiple rows.
> */
> if (proc != NULL)
> {
> @@ -428,7 +428,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
> /*
> * Show the leader only for active
> parallel workers. This
> * leaves the field as NULL for the
> leader of a parallel
> - * group.
> + * group or the leader of a parallel apply.
> */
> if (leader && leader->pid !=
> beentry->st_procpid)
>

The updated comment LGTM.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-01-16 21:48:28 Re: Extracting cross-version-upgrade knowledge from buildfarm client
Previous Message Maciek Sakrejda 2023-01-16 21:41:49 Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)