Re: Add Information during standby recovery conflicts

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add Information during standby recovery conflicts
Date: 2020-11-30 18:04:07
Message-ID: fafa607c-9e85-b742-ab46-2ee807ab9364@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/11/30 16:26, Masahiko Sawada wrote:
> On Mon, Nov 30, 2020 at 3:46 PM Drouvot, Bertrand <bdrouvot(at)amazon(dot)com> wrote:
>>
>> Hi,
>>
>> On 11/30/20 4:41 AM, Masahiko Sawada wrote:
>>> On Sun, Nov 29, 2020 at 3:47 PM Drouvot, Bertrand <bdrouvot(at)amazon(dot)com> wrote:
>>>> Hi Alvaro,
>>>>
>>>> On 11/28/20 6:36 PM, Alvaro Herrera wrote:
>>>>> Hi Bertrand,
>>>>>
>>>>> On 2020-Nov-28, Drouvot, Bertrand wrote:
>>>>>
>>>>>> + if (nprocs > 0)
>>>>>> + {
>>>>>> + ereport(LOG,
>>>>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s",
>>>>>> + msecs, usecs, _(get_recovery_conflict_desc(reason))),
>>>>>> + (errdetail_log_plural("Conflicting process: %s.",
>>>>>> + "Conflicting processes: %s.",
>>>>>> + nprocs, buf.data))));
>>>>>> + }
>>>>>> + else
>>>>>> + {
>>>>>> + ereport(LOG,
>>>>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s",
>>>>>> + msecs, usecs, _(get_recovery_conflict_desc(reason)))));
>>>>>> + }
>>>>>> +
>>>>>> + pfree(buf.data);
>>>>>> + }
>>>>>> + else
>>>>>> + ereport(LOG,
>>>>>> + (errmsg("recovery still waiting after %ld.%03d ms: %s",
>>>>>> + msecs, usecs, _(get_recovery_conflict_desc(reason)))));
>>>>>> +}
>>>>> Another trivial stylistic point is that you can collapse all these
>>>>> ereport calls into one, with something like
>>>>>
>>>>> ereport(LOG,
>>>>> errmsg("recovery still waiting after ...", opts),
>>>>> waitlist != NULL ? errdetail_log_plural("foo bar baz", opts) : 0);
>>>>>
>>>>> where the "waitlist" has been constructed beforehand, or is set to NULL
>>>>> if there's no process list.
>>>> Nice!
>>>>
>>>>>> + switch (reason)
>>>>>> + {
>>>>>> + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
>>>>>> + reasonDesc = gettext_noop("for recovery conflict on buffer pin");
>>>>>> + break;
>>>>> Pure bikeshedding after discussing this with my pillow: I think I'd get
>>>>> rid of the initial "for" in these messages.
>>>> both comments implemented in the new attached version.
>>>>
>>> Thank you for updating the patch!
>>>
>>> + /* Also, set deadlock timeout for logging purpose if necessary */
>>> + if (log_recovery_conflict_waits && !need_log)
>>> + {
>>> + timeouts[cnt].id = STANDBY_TIMEOUT;
>>> + timeouts[cnt].type = TMPARAM_AFTER;
>>> + timeouts[cnt].delay_ms = DeadlockTimeout;
>>> + cnt++;
>>> + }
>>>
>>> You changed to 'need_log' but this condition seems not correct. I
>>> think we need to set this timeout when both log_recovery_conflict and
>>> need_log is true.
>>
>> Nice catch!
>>
>> In fact it behaves correctly, that's jut the 'need_log' name that is
>> miss leading: I renamed it to 'already_logged' in the new attached version.
>>
>
> Thanks! I'd prefer 'need_log' because we can check it using the
> affirmative form in that condition, which would make the code more
> readable a bit. But I'd like to leave it to committers. I've marked
> this patch as "Ready for Committer".

I'm still in the middle of the review, but please let me share
my current review comments.

+ /* Set wait start timestamp if logging is enabled */
+ if (log_recovery_conflict_waits)
+ waitStart = GetCurrentTimestamp();

This seems to cause even the primary server to call GetCurrentTimestamp()
if logging is enabled. To avoid unnecessary GetCurrentTimestamp(),
we should add "InHotStandby" into the if-condition?

+ initStringInfo(&buf);
+
+ if (wait_list)

Isn't it better to call initStringInfo() only when wait_list is not NULL?
For example, we can update the code so that it's executed when nprocs == 0.

+ if (proc)
+ {
+ if (nprocs == 0)
+ appendStringInfo(&buf, "%d", proc->pid);
+ else
+ appendStringInfo(&buf, ", %d", proc->pid);
+
+ nprocs++;

What happens if all the backends in wait_list have gone? In other words,
how should we handle the case where nprocs == 0 (i.e., nprocs has not been
incrmented at all)? This would very rarely happen, but can happen.
In this case, since buf.data is empty, at least there seems no need to log
the list of conflicting processes in detail message.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-11-30 18:21:15 Re: Improving spin-lock implementation on ARM.
Previous Message Fujii Masao 2020-11-30 17:59:15 Re: Add Information during standby recovery conflicts