From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Perform streaming logical transactions by background workers and parallel apply |
Date: | 2022-12-19 12:47:25 |
Message-ID: | CAA4eK1KdVRffYEYiv9=4tVWhP=Jkfh1h6FM0A1R0NEe8cLT8+g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Dec 17, 2022 at 7:34 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Agreed. I have addressed all the comments and did some cosmetic changes.
> Attach the new version patch set.
>
Few comments:
============
1.
+ if (fileset_state == FS_SERIALIZE_IN_PROGRESS)
+ {
+ pa_lock_stream(MyParallelShared->xid, AccessShareLock);
+ pa_unlock_stream(MyParallelShared->xid, AccessShareLock);
+ }
+
+ /*
+ * We cannot read the file immediately after the leader has serialized all
+ * changes to the file because there may still be messages in the memory
+ * queue. We will apply all spooled messages the next time we call this
+ * function, which should ensure that there are no messages left in the
+ * memory queue.
+ */
+ else if (fileset_state == FS_SERIALIZE_DONE)
+ {
Once we have waited in the FS_SERIALIZE_IN_PROGRESS, the file state
can be FS_SERIALIZE_DONE immediately after that. So, won't it be
better to have a separate if block for FS_SERIALIZE_DONE state? If you
agree to do so then we can probably remove the comment: "* XXX It is
possible that immediately after we have waited for a lock in ...".
2.
+void
+pa_decr_and_wait_stream_block(void)
+{
+ Assert(am_parallel_apply_worker());
+
+ if (pg_atomic_sub_fetch_u32(&MyParallelShared->pending_stream_count, 1) == 0)
I think here the count can go negative when we are in serialize mode
because we don't increase it for serialize mode. I can't see any
problem due to that but OTOH, this doesn't seem to be intended because
in the future if we decide to implement the functionality of switching
back to non-serialize mode, this could be a problem. Also, I guess we
don't even need to try locking/unlocking the stream lock in that case.
One idea to avoid this is to check if the pending count is zero then
if file_set in not available raise an error (elog ERROR), otherwise,
simply return from here.
3. In apply_handle_stream_stop(), we are setting backendstate as idle
for cases TRANS_LEADER_SEND_TO_PARALLEL and TRANS_PARALLEL_APPLY. For
other cases, it is set by stream_stop_internal. I think it would be
better to set the state explicitly for all cases to make the code look
consistent and remove it from stream_stop_internal(). The other reason
to remove setting the state from stream_stop_internal() is that when
that function is invoked from other places like
apply_handle_stream_commit(), it seems to be setting the idle before
actually we reach the idle state.
4. Apart from the above, I have made a few changes in the comments,
see attached.
--
With Regards,
Amit Kapila.
Attachment | Content-Type | Size |
---|---|---|
changes_amit_1.patch | application/octet-stream | 4.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Ajin Cherian | 2022-12-19 13:19:14 | Re: Support logical replication of DDLs |
Previous Message | Ted Yu | 2022-12-19 12:44:15 | Re: On login trigger: take three |