From: | "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com> |
Subject: | RE: Perform streaming logical transactions by background workers and parallel apply |
Date: | 2022-09-29 09:50:01 |
Message-ID: | TYAPR01MB58664D1D5B40689769386873F5579@TYAPR01MB5866.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
Dear Hou,
Thanks for updating the patch. I will review it soon, but first let me reply to your comment.
> > 04. applyparallelworker.c - LogicalParallelApplyLoop()
> >
> > ```
> > + shmq_res = shm_mq_receive(mqh, &len, &data, false);
> > ...
> > + if (ConfigReloadPending)
> > + {
> > + ConfigReloadPending = false;
> > + ProcessConfigFile(PGC_SIGHUP);
> > + }
> > ```
> >
> >
> > Here the parallel apply worker waits to receive messages, and ProcessConfigFile()
> > is called only after they have been dispatched.
> > This means that the .conf file will not be read until the parallel apply worker receives new
> > messages and applies them.
> >
> > This may be problematic when users change log_min_messages to debugXXX for
> > debugging but streamed transactions rarely arrive.
> > They expect detailed output to appear in the log from the next
> > streaming chunk, but it does not.
> >
> > This does not occur in the leader worker when it waits for messages from the publisher,
> > because it uses libpqrcv_receive(), which works asynchronously.
> >
> > I'm not sure whether it should be documented that the evaluation of GUCs may
> > be delayed. What do you think?
>
> I changed the shm_mq_receive to asynchronous mode, which is also consistent with
> what we did for the Gather node when reading data from parallel query workers.
I checked your implementation, but it seems that the parallel apply worker will not sleep
even if there are no messages or signals. This might be very inefficient.
The same approach is used for the Gather node in gather_readnext(), but I think it relies on the premise
that the wait time is short, because it depends on only one Gather node.
For the parallel apply worker, however, we cannot predict the wait time because
it depends on the streamed transactions. If such transactions rarely arrive, parallel apply workers may consume a lot of CPU time.
I think we should wait for a short time, or until the leader notifies us, if shmq_res == SHM_MQ_WOULD_BLOCK.
What do you think?
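To illustrate the idea, here is a rough standalone sketch. It is not PostgreSQL code and not taken from the patch: a POSIX pipe stands in for the shared memory queue, and every demo_* name is hypothetical. In the actual worker I would expect the wait step to be something like WaitLatch() with WL_LATCH_SET | WL_TIMEOUT followed by ResetLatch(), but that is only my assumption about how it could be written.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical result codes, loosely modelled on SHM_MQ_SUCCESS / SHM_MQ_WOULD_BLOCK. */
typedef enum { DEMO_MQ_SUCCESS, DEMO_MQ_WOULD_BLOCK } demo_mq_result;

/* Hypothetical stand-in for the queue: a pipe carrying one byte per message. */
typedef struct { int rfd; int wfd; } demo_queue;

static bool demo_queue_init(demo_queue *q)
{
    int fds[2];

    if (pipe(fds) != 0)
        return false;
    q->rfd = fds[0];
    q->wfd = fds[1];
    return true;
}

/* "Leader" side: writing a byte both delivers the message and wakes the waiter. */
static bool demo_queue_send(demo_queue *q, char msg)
{
    return write(q->wfd, &msg, 1) == 1;
}

/* Non-blocking receive: returns immediately if nothing is queued. */
static demo_mq_result demo_queue_try_receive(demo_queue *q, char *msg)
{
    struct pollfd pfd = { .fd = q->rfd, .events = POLLIN };

    if (poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN) &&
        read(q->rfd, msg, 1) == 1)
        return DEMO_MQ_SUCCESS;
    return DEMO_MQ_WOULD_BLOCK;
}

/* Sleep until the sender writes something or timeout_ms elapses. */
static bool demo_wait(demo_queue *q, int timeout_ms)
{
    struct pollfd pfd = { .fd = q->rfd, .events = POLLIN };

    assert(timeout_ms >= 0);
    return poll(&pfd, 1, timeout_ms) > 0;
}

/*
 * Receive loop that sleeps instead of spinning.  Returns the number of
 * sleeps before a message arrived, or -1 after max_waits empty rounds.
 */
static int demo_receive_with_wait(demo_queue *q, char *msg,
                                  int timeout_ms, int max_waits)
{
    int waits = 0;

    for (;;)
    {
        if (demo_queue_try_receive(q, msg) == DEMO_MQ_SUCCESS)
            return waits;
        if (waits >= max_waits)
            return -1;

        /*
         * Would-block case: the real worker could check for a pending
         * config reload here, then sleep until notified or timed out.
         */
        demo_wait(q, timeout_ms);
        waits++;
    }
}
```

With this shape the worker still wakes up periodically, so a changed GUC would be noticed within one timeout, but it does not consume CPU while no streamed transaction is coming.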
Best Regards,
Hayato Kuroda
FUJITSU LIMITED