From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-12-27 05:16:52
Message-ID: CAA4eK1LbKORmo3n1iFV+qKmeiuHvvn4U2i9KGfg11b-QE5AUHQ@mail.gmail.com
Lists: pgsql-hackers
On Tue, Dec 27, 2022 at 10:36 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Dec 27, 2022 at 9:15 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Dec 26, 2022 at 7:35 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > In the commit message, there is a statement like this
> > >
> > > "However, if the leader apply worker times out while attempting to
> > > send a message to the
> > > parallel apply worker, it will switch to "partial serialize" mode - in this
> > > mode the leader serializes all remaining changes to a file and notifies the
> > > parallel apply workers to read and apply them at the end of the transaction."
> > >
> > > I think it is a good idea to serialize the changes to a file in this
> > > case to avoid deadlocks, but why does the parallel worker need to wait
> > > until the transaction commits before reading the file? I mean we could
> > > switch to the serialize state and make the parallel worker pull changes
> > > from the file, and once the parallel worker has caught up with the
> > > changes it could change the state back to "shared memory", so that
> > > the apply worker can again start sending through shared memory.
> > >
> > > Streaming transactions are generally large, so it is possible that the
> > > shared memory queue gets full because of a burst of changes for a
> > > particular transaction; but later, when the load shifts to other
> > > transactions, it would be quite common for the worker to catch up with
> > > the changes, and then it would be better to take advantage of shared
> > > memory again. Otherwise, in this case, we are just wasting resources
> > > (the worker and the shared memory queue) while still writing to the file.
> > >
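To make the switch-back idea quoted above concrete, here is a minimal sketch of how the leader could flip between the queue and a spool file. Every identifier in it (ApplyTransferMode, pa_queue_has_room(), pa_worker_caught_up(), and the send/spool helpers) is hypothetical and not part of the actual patch:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers -- not the patch's real API. */
extern bool pa_queue_has_room(size_t len);	/* can the queue take len bytes soon? */
extern bool pa_worker_caught_up(void);		/* has the worker drained the spool file? */
extern void pa_queue_send(const void *change, size_t len);
extern void pa_spool_write(const void *change, size_t len);

typedef enum ApplyTransferMode
{
	TRANS_MODE_SHM_QUEUE,		/* stream changes via the shared memory queue */
	TRANS_MODE_SPOOL_FILE		/* spool changes to a file on disk */
} ApplyTransferMode;

static ApplyTransferMode cur_mode = TRANS_MODE_SHM_QUEUE;

/*
 * Send one change to the parallel apply worker.  Fall back to the spool
 * file when the queue stays full, and return to the queue once the worker
 * has caught up with the file.
 */
static void
leader_send_change(const void *change, size_t len)
{
	if (cur_mode == TRANS_MODE_SHM_QUEUE)
	{
		if (!pa_queue_has_room(len))
			cur_mode = TRANS_MODE_SPOOL_FILE;
	}
	else if (pa_worker_caught_up())
		cur_mode = TRANS_MODE_SHM_QUEUE;

	if (cur_mode == TRANS_MODE_SHM_QUEUE)
		pa_queue_send(change, len);
	else
		pa_spool_write(change, len);
}
```

The reply below explains why this is harder than it looks: BufFile does not let two processes open the same file at once, and a message that has been partially written to the queue cannot simply be abandoned.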
> >
> > Note that there is a certain threshold timeout for which we wait
> > before switching to serialize mode, and normally it expires only when
> > the PA starts waiting on some lock acquired by a backend. Now, apart
> > from that, even if we decide to switch modes, the current BufFile
> > mechanism doesn't have a good way to support it. It doesn't allow two
> > processes to open the same BufFile at the same time, which means we
> > would need to maintain multiple files to achieve a mode where we can
> > switch back from serialize mode. We cannot let the LA wait for the PA
> > to close the file, as that could introduce another kind of deadlock.
> > For details, see the discussion in the email [1]. The other problem is
> > that we have no way to deal with partially sent data via the shared
> > memory queue. Say we time out while sending the data: we would have to
> > resend the same message until it succeeds, which is tricky because
> > we can't keep retrying as that can lead to deadlock. I think if we try
> > to build this new mode, it will be a lot of effort without equivalent
> > returns. In common cases, we don't see a timeout and a switch to
> > serialize mode. It happens mostly when the PA starts to wait for a
> > lock acquired by another backend, or when the machine is too slow to
> > keep up with the number of parallel apply workers. So, it doesn't seem
> > worth adding more complexity to the first version, but we don't rule
> > out the possibility of doing so in the future if such cases turn out
> > to be common.
> >
> > [1] - https://www.postgresql.org/message-id/CAD21AoDScLvLT8JBfu5WaGCPQs_qhxsybMT%2BsMXJ%3DQrDMTyr9w%40mail.gmail.com
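As a rough illustration of the partial-send constraint described above (not the patch's actual logic; queue_send_nowait(), spool_message(), and the other names are invented for this sketch):

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { SEND_OK, SEND_WOULD_BLOCK } send_result;

/*
 * Hypothetical primitives, invented for this sketch.  queue_send_nowait()
 * is assumed to resume a partially written message on the next call and to
 * report cumulative progress through *bytes_written.
 */
extern send_result queue_send_nowait(const void *msg, size_t len, size_t *bytes_written);
extern void wait_briefly(void);
extern void spool_message(const void *msg, size_t len);

#define MAX_SEND_RETRIES	1000	/* bounded: retrying forever can deadlock */

/*
 * Try to push one message into the shared memory queue.  While nothing has
 * been written we can still fall back to the spool file, but once the
 * message is partially in the queue we must keep retrying it, since the
 * worker would otherwise read a truncated message.  Returns false if we
 * gave up; the caller would then have to stop the parallel worker rather
 * than leave a half-written message behind.
 */
static bool
send_or_spool(const void *msg, size_t len)
{
	size_t		written = 0;
	int			retries = 0;

	while (queue_send_nowait(msg, len, &written) == SEND_WOULD_BLOCK)
	{
		if (written == 0)
		{
			/* Nothing in the queue yet: safe to switch to the file. */
			spool_message(msg, len);
			return true;
		}

		/* Partially sent: committed to the queue for this message. */
		if (++retries > MAX_SEND_RETRIES)
			return false;

		wait_briefly();
	}

	return true;
}
```

The key point is the `written > 0` case: once any bytes of a message are in the queue, the leader has to finish that message through the queue, which is why a clean mid-transaction switch back and forth is hard.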
>
> Okay, I see. And once we switch to serialize mode we can't release
> the worker either, because we have already applied partial changes
> for the transaction in the PA, so we cannot apply the remaining
> changes from the LA. I understand that switching back to parallel
> apply mode might require a lot of complex design, but my only worry
> is that in such cases we will be holding on to the parallel worker
> just to wait until commit to read from the spool file. But as you
> said, it should not be a very common case, so maybe this is fine.
>
Right, and as said previously, if required (which is not clear at this
stage) we can develop it in a later version as well.
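For anyone skimming the thread, here is a simplified sketch of the worker side of the "partial serialize" flow described in the commit message quoted above, i.e. the parallel apply worker only turning to the spool file once it sees the end of the transaction. The identifiers are placeholders rather than the patch's exact functions:

```c
#include <stdbool.h>

/* Placeholder helpers -- not the patch's exact identifiers. */
extern bool pa_leader_switched_to_partial_serialize(void);
extern void apply_changes_from_spool_file(void);
extern void commit_applied_transaction(void);

/*
 * Parallel apply worker, on receiving the final commit message of a
 * streamed transaction: if the leader switched to "partial serialize"
 * mode part-way through, the remaining changes live in a file, so read
 * and apply them before committing.
 */
static void
pa_handle_stream_commit(void)
{
	if (pa_leader_switched_to_partial_serialize())
		apply_changes_from_spool_file();

	commit_applied_transaction();
}
```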
--
With Regards,
Amit Kapila.