From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: some review comments on logical rep code |
Date: | 2017-04-18 08:16:58 |
Message-ID: | CAD21AoB1HiZV++ah=VrQjVZoH+b6Rn8Atv6Miz+HCp0j+L6GZQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Hi,
>
> Thank you for the revised version.
>
> At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoA+yFJTbw0PCw-ttmh9TsTygM3=iavXF+5Bx4O9-UXdsQ(at)mail(dot)gmail(dot)com>
>> On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI
>> > <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoB3L50jWwuyaGLxJzmrmXmYJAOBHzDPR=FKwjw29mxj0Q(at)mail(dot)gmail(dot)com>
>> >>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> >>> 1.
>> >>> >
>> >>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer
>> >>> > value from the argument, instead of DatumGetObjectId().
>> >>>
>> >>> Attached 001 patch fixes it.
>> >>
>> >> Hmm. I looked at the launcher side and found that it assigns bare
>> >> integer to bgw_main_arg. It also should use Int32GetDatum.
>> >
>> > Yeah, I agreed. Will fix it.
>
> Thanks.
>
>> >>> 2.
>> >>> >
>> >>> > No one resets on_commit_launcher_wakeup flag to false.
>> >>>
>> >>> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke
>> >>> up the launcher process.
>> >>
>> >> It is omitting the abort case. Maybe it should be
>> >> AtEOXact_ApplyLauncher(bool commit)?
>> >
>> > Should we wake up the launcher process even when abort?
>
> No, I meant that on_commit_launcher_wakeup should be just reset
> without waking up launcher on abort.
I understood. Sounds reasonable.
>
>> >>> 3.
>> >>> >
>> >>> > ApplyLauncherWakeup() should be static function.
>> >>>
>> >>> Attached 003 patch fixes it (and also fixes #5 review comment).
>> >>>
>> >>> 4.
>> >>> >
>> >>> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's
>> >>> > subcommands (e.g., ENABLED one) should wake the launcher up on commit.
>> >>>
>> >>> This is also reported by Horiguchi-san on another thread[1], and patch exists.
>> >>>
>> >>> 5.
>> >>> >
>> >>> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()?
>> >>>
>> >>> I also guess it's necessary. This change is included in #3 patch.
>> >>
>> >> IsBackendPid() is not free, I suppose that just ignoring failure
>> >> with ESRCH is enough.
>> >
>> > After thought, since LogicalRepCtx->launcher_pid could be 0 should, we
>> > check if launcher_pid != 0?
>
> Yes. For example, code to send a signal to autovacuum launcher
> looks as the follows. I don't think there's a reason to do
> different thing here.
>
> | avlauncher_needs_signal = false;
> | if (AutoVacPID != 0)
> | kill(AutoVacPID, SIGUSR2);
>
Fixed.
>
>> >>> 6.
>> >>> >
>> >>> > Normal statements that the walsender for logical rep runs are logged
>> >>> > by log_replication_commands. So if log_statement is also set,
>> >>> > those commands are logged twice.
>> >>>
>> >>> Attached 006 patch fixes it by adding "-c log_statement = none" to
>> >>> connection parameter of libpqrcv if logical = true.
>> >>
>> >> The control by log_replication_commands is performed on
>> >> walsender, so this also shuld be done on the same place. Anyway
>> >> log_statement is irrelevant to walsender.
>> >
>> > Patch 006 emits all logs executed by the apply worker as a log of
>> > log_replication_command but perhaps you're right. It would be better
>> > to leave the log of log_statement if the command executed by the apply
>> > worker is a SQL command. Will fix.
>
> Here, we can choose whether a SQL command is a part of
> replication commands or not. The previous fix thought it is and
> this doesn't. (They are emitted in different forms) Is this ok?
Yes, v2 patch logs other than T_SQLCmd as a replication command, and
T_SQLCmd is logged later in exec_simple_query. The
log_replication_command logs only replication commands[1], it doesn't
mean to log commands executed on replication connection. I think such
behavior is right.
> I'm not sure why you have moved the location of the code, but if
> it should show all received commands, it might be better to be
> placed before CHECK_FOR_INTERRUPTS..
Hmm, the reason why I've moved it is that we cannot know whether given
cmd_string is a SQL command or a replication command before parsing.
>
> The comment for the code seems should be more clearly.
>
> - * compatibility. To prevent the command is logged doubly, we doesn't
> - * log it when the command is a SQL command.
> + * compatibility. SQL command are logged later according
> + * to log_statement setting.
Fixed.
>> >>> 7.
>> >>> >
>> >>> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits
>> >>> > an error. The callback function to reset it needs to be registered
>> >>> > via on_shmem_exit().
>> >>>
>> >>> Attached 007 patch adds callback function to reset LogicalRepCtx->launcher_pid.
>> >>>
>> >>> 8.
>> >>> >
>> >>> > Typo: "an subscriber" should be "a subscriber" in some places.
>> >>>
>> >>> It seems that there is no longer these typo.
>> >>>
>> >>> 9.
>> >>> > /* Get conninfo */
>> >>> > datum = SysCacheGetAttr(SUBSCRIPTIONOID,
>> >>> > tup,
>> >>> > Anum_pg_subscription_subconninfo,
>> >>> > &isnull);
>> >>> > Assert(!isnull);
>> >>> >
>> >>> > This assertion makes me think that pg_subscription.subconnifo should
>> >>> > have NOT NULL constraint. Also subslotname and subpublications
>> >>> > should have NOT NULL constraint because of the same reason.
>> >>>
>> >>> Agreed. Attached 009 patch add NOT NULL constraint to all other
>> >>> columns of pg_subscription. pg_subscription.subsynccommit should have
>> >>> it as well.
>> >>
>> >> I'm not sure the policy here, but I suppose that the assertions
>> >> there are still required irrelevantly from the nun-nullness of
>> >> the attribute.
>> >
>> > You're right. Will fix it.
>> >
>> >>> 10.
>> >>> >
>> >>> > SpinLockAcquire(&MyLogicalRepWorker->relmutex);
>> >>> > MyLogicalRepWorker->relstate =
>> >>> > GetSubscriptionRelState(MyLogicalRepWorker->subid,
>> >>> > MyLogicalRepWorker->relid,
>> >>> > &MyLogicalRepWorker->relstate_lsn,
>> >>> > false);
>> >>> > SpinLockRelease(&MyLogicalRepWorker->relmutex);
>> >>> >
>> >>> > Non-"short-term" function like GetSubscriptionRelState() should not
>> >>> > be called while spinlock is being held.
>> >>> >
>> >>>
>> >>> One option is adding new LWLock for the relation state change but this
>> >>> lock is used at many locations where the operation actually doesn't
>> >>> take time. I think that the discussion would be needed to get
>> >>> consensus, so patch for it is not attached.
>> >>
>> >> From the point of view of time, I agree that it doesn't seem to
>> >> harm. Bt I thing it as more significant problem that
>> >> potentially-throwable function is called within the mutex
>> >> region. It potentially causes a kind of dead lock. (That said,
>> >> theoretically dosn't occur and I'm not sure what happens by the
>> >> dead lock..)
>> >>
>> >>
>> >>> [1] https://www.postgresql.org/message-id/20170406.212441.177025736.horiguchi.kyotaro@lab.ntt.co.jp
>> >>
>> >> --
>> >> Kyotaro Horiguchi
>> >> NTT Open Source Software Center
>> >>
>> >
>> >
>>
>> I've attached latest v2 three patches; 001, 006 and 009. The review
>> comments I got so far are incorporated.
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
Attached current latest all patches.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
006_Prevent_to_emit_statement_log_double_v3.patch | application/octet-stream | 1.5 KB |
001_change_DatumGetObjectId_to_DatumGetInt32_v3.patch | application/octet-stream | 1.1 KB |
002_Reset_on_commit_launcher_wakeup_v3.patch | application/octet-stream | 1.7 KB |
009_Add_not_null_constraint_v3.patch | application/octet-stream | 1.0 KB |
007_Regiter_on_shmem_exit_for_launcher_v3.patch | application/octet-stream | 1.4 KB |
003_005_Fix_ApplyLancherWakeUp_function_v3.patch | application/octet-stream | 1.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Nikhil Sontakke | 2017-04-18 08:17:28 | Re: Failed recovery with new faster 2PC code |
Previous Message | Simon Riggs | 2017-04-18 08:15:48 | Re: scram and \password |