From: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> |
Cc: | "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Add Information during standby recovery conflicts |
Date: | 2020-10-30 03:25:10 |
Message-ID: | c141f3fc-2cff-a8f5-08b5-0aafa173c949@oss.nttdata.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2020/10/30 10:29, Masahiko Sawada wrote:
> ,
>
> On Thu, 29 Oct 2020 at 00:16, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>
>>
>>
>> On 2020/10/27 9:41, Masahiko Sawada wrote:
>>> On Tue, 20 Oct 2020 at 22:02, Drouvot, Bertrand <bdrouvot(at)amazon(dot)com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 10/15/20 9:15 AM, Masahiko Sawada wrote:
>>>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>>>>>
>>>>>
>>>>>
>>>>> On Thu, 15 Oct 2020 at 14:52, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>>>>>> At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote in
>>>>>>>> ereport(..(errmsg("%s", _("hogehoge")))) results in
>>>>>>>> fprintf((translated("%s")), translate("hogehoge")).
>>>>>>>>
>>>>>>>> So your change (errmsg("%s", gettext_noop("hogehoge")) results in
>>>>>>>>
>>>>>>>> fprintf((translated("%s")), DONT_translate("hogehoge")).
>>>>>>>>
>>>>>>>> which leads to a translation problem.
>>>>>>>>
>>>>>>>> (errmsg(gettext_noop("hogehoge"))
>>>>>>> This seems equivalent to (errmsg("hogehoge")), right?
>>>>>> Yes and no. However eventually the two works the same way,
>>>>>> "(errmsg(gettext_noop("hogehoge"))" is a shorthand of
>>>>>>
>>>>>> 1: char *msg = gettext_noop("hogehoge");
>>>>>> ...
>>>>>> 2: .. (errmsg(msg));
>>>>>>
>>>>>> That is, the line 1 only registers a message id "hogehoge" and doesn't
>>>>>> translate. The line 2 tries to translate the content of msg and it
>>>>>> finds the translation for the message id "hogehoge".
>>>>> Understood.
>>>>>
>>>>>>> I think I could understand translation stuff. Given we only report the
>>>>>>> const string returned from get_recovery_conflict_desc() without
>>>>>>> placeholders, the patch needs to use errmsg_internal() instead while
>>>>>>> not changing _() part. (errmsg(get_recovery_conflict_desc())) is not
>>>>>>> good (warned by -Wformat-security).
>>>>>> Ah, right. we get a complain if no value parameters added. We can
>>>>>> silence it by adding a dummy parameter to errmsg, but I'm not sure
>>>>>> which is preferable.
>>>>> Okay, I'm going to use errmsg_internal() for now until a better idea comes.
>>>>>
>>>>> I've attached the updated patch that fixed the translation part.
>>>>
>>>> Thanks for reviewing and helping on this patch!
>>>>
>>>> The patch tester bot is currently failing due to:
>>>>
>>>> "proc.c:1290:5: error: ‘standbyWaitStart’ may be used uninitialized in
>>>> this function [-Werror=maybe-uninitialized]"
>>>>
>>>> I've attached a new version with the minor change to fix it.
>>>>
>>>
>>> Thank you for updating the patch!
>>>
>>> I've looked at the patch and revised a bit the formatting etc.
>>>
>>> After some thoughts, I think it might be better to report the waiting
>>> time as well. it would help users and there is no particular reason
>>> for logging the report only once. It also helps make the patch clean
>>> by reducing the variables such as recovery_conflict_logged. I’ve
>>> implemented it in the v8 patch.
>>
>> I read v8 patch. Here are review comments.
>
> Thank you for your review.
>
>>
>> When recovery conflict with buffer pin happens, log message is output
>> every deadlock_timeout. Is this intentional behavior? If yes, IMO that's
>> not good because lots of messages can be output.
>
> Agreed.
>
> if we were to log the recovery conflict only once in bufferpin
> conflict case, we would log it multiple times only in lock conflict
> case. So I guess it's better to do that in all conflict cases.
Yes, I agree that this behavior basically should be consistent between all cases.
>
>>
>> + if (log_recovery_conflict_waits)
>> + waitStart = GetCurrentTimestamp();
>>
>> What happens if log_recovery_conflict_waits is off at the beginning and
>> then it's changed during waiting for the conflict? In this case, waitStart is
>> zero, but log_recovery_conflict_waits is on. This may cause some problems?
>
> Hmm, I didn't see a path that happens to reload the config file during
> waiting for buffer cleanup lock. Even if the startup process receives
> SIGHUP during that, it calls HandleStartupProcInterrupts() at the next
> convenient time. It could be the beginning of main apply loop or
> inside WaitForWALToBecomeAvailable() and so on but I didn’t see it in
> the wait loop for taking a buffer cleanup.
Yes, you're right. I seem to have read the code wrongly.
> However, I think it’s
> better to use (waitStart > 0) for safety when checking if we log the
> recovery conflict instead of log_recovery_conflict_waits.
>
>>
>> + if (report_waiting)
>> + ts = GetCurrentTimestamp();
>>
>> GetCurrentTimestamp() doesn't need to be called every cycle
>> in the loop after "logged" is true and "new_status" is not NULL.
>
> Agreed
>
>>
>> +extern const char *get_procsignal_reason_desc(ProcSignalReason reason);
>>
>> Is this garbage?
>
> Yes.
>
>>
>> When log_lock_waits is enabled, both "still waiting for ..." and "acquired ..."
>> messages are output. Like this, when log_recovery_conflict_waits is enabled,
>> not only "still waiting ..." but also "resolved ..." message should be output?
>> The latter message might not need to be output if the conflict is canceled
>> due to max_standby_xxx_delay parameter. The latter message would be
>> useful to know how long the recovery was waiting for the conflict. Thought?
>> It's ok to implement this as a separate patch later, though.
>
> There was a discussion that the latter message without waiting time is
> not necessarily needed because the canceled process will log
> "canceling statement due to conflict with XXX" as you mentioned. I
> agreed with that. But I agree that the latter message with waiting
> time would help users, for instance when the startup process is
> waiting for multiple processes and it takes a time to cancel all of
> them.
I agree that it's useful to output the wait time.
But as I told in previous email, it's ok to focus on the current patch
for now and then implement this as a separate patch later.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro Horiguchi | 2020-10-30 03:28:51 | Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8 |
Previous Message | Tom Lane | 2020-10-30 03:24:32 | Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8 |