Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-16 06:40:50
Message-ID: CAA4eK1JA-=pSNmeCsjegnQBb00Q3U4ZovGbfHqK_V7R527QMQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 16, 2023 at 10:24 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> 2.
>
> /*
> + * Return the pid of the leader apply worker if the given pid is the pid of a
> + * parallel apply worker, otherwise return InvalidPid.
> + */
> +pid_t
> +GetLeaderApplyWorkerPid(pid_t pid)
> +{
> + int leader_pid = InvalidPid;
> + int i;
> +
> + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
> +
> + for (i = 0; i < max_logical_replication_workers; i++)
> + {
> + LogicalRepWorker *w = &LogicalRepCtx->workers[i];
> +
> + if (isParallelApplyWorker(w) && w->proc && pid == w->proc->pid)
> + {
> + leader_pid = w->leader_pid;
> + break;
> + }
> + }
> +
> + LWLockRelease(LogicalRepWorkerLock);
> +
> + return leader_pid;
> +}
>
> 2a.
> IIUC the IsParallelApplyWorker macro does nothing except check that
> the leader_pid is not InvalidPid anyway, so AFAIK this algorithm does
> not benefit from using this macro because we will want to return
> InvalidPid anyway if the given pid matches.
>
> So the inner condition can just say:
>
> if (w->proc && w->proc->pid == pid)
> {
> leader_pid = w->leader_pid;
> break;
> }
>

Yeah, this should also work but I feel the current one is explicit and
more clear.

> ~
>
> 2b.
> A possible alternative comment.
>
> BEFORE
> Return the pid of the leader apply worker if the given pid is the pid
> of a parallel apply worker, otherwise return InvalidPid.
>
>
> AFTER
> If the given pid has a leader apply worker then return the leader pid,
> otherwise, return InvalidPid.
>

I don't think that is an improvement.

> ======
>
> src/backend/utils/adt/pgstatfuncs.c
>
> 3.
>
> @@ -434,6 +435,16 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
> values[28] = Int32GetDatum(leader->pid);
> nulls[28] = false;
> }
> + else
> + {
> + int leader_pid = GetLeaderApplyWorkerPid(beentry->st_procpid);
> +
> + if (leader_pid != InvalidPid)
> + {
> + values[28] = Int32GetDatum(leader_pid);
> + nulls[28] = false;
> + }
> +
>
> 3a.
> There is an existing comment preceding this if/else but it refers only
> to leaders of parallel groups. Should that comment be updated to
> mention the leader apply worker too?
>

Yeah, we can slightly adjust the comments. How about something like the below:
index 415e711729..7eb668634a 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -410,9 +410,9 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)

/*
* If a PGPROC entry was retrieved, display
wait events and lock
- * group leader information if any. To avoid
extra overhead, no
- * extra lock is being held, so there is no guarantee of
- * consistency across multiple rows.
+ * group leader or apply leader information if
any. To avoid extra
+ * overhead, no extra lock is being held, so
there is no guarantee
+ * of consistency across multiple rows.
*/
if (proc != NULL)
{
@@ -428,7 +428,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
/*
* Show the leader only for active
parallel workers. This
* leaves the field as NULL for the
leader of a parallel
- * group.
+ * group or the leader of a parallel apply.
*/
if (leader && leader->pid !=
beentry->st_procpid)

> ~
>
> 3b.
> It may be unrelated to this patch, but it seems strange to me that the
> nulls[28]/values[28] assignments are done where they are. Every other
> nulls/values assignment of this function here is pretty much in the
> correct numerical order except this one, so IMO this code ought to be
> relocated to later in this same function.
>

This is not related to the current patch but I see there is merit in
the current coding as it is better to retrieve all the fields of proc
together.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhang Mingli 2023-01-16 06:57:55 Re: Code review in dsa.c
Previous Message David G. Johnston 2023-01-16 06:33:56 Re: If there are more than two functions in different schemas, the functions have the same name and same arguments, \df[+] only display the function that schema first appeared in the search_path.