Re: Some problems of recovery conflict wait events

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Some problems of recovery conflict wait events
Date: 2020-04-02 07:12:11
Message-ID: daa8896c-abba-95ad-65e5-269b52e619ed@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/04/02 15:54, Masahiko Sawada wrote:
> On Thu, 2 Apr 2020 at 15:34, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>
>>
>>
>> On 2020/04/02 14:25, Masahiko Sawada wrote:
>>> On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>
>>>>
>>>>
>>>> On 2020/03/30 20:10, Masahiko Sawada wrote:
>>>>> On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2020/03/04 14:31, Masahiko Sawada wrote:
>>>>>>> On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/04 13:27, Michael Paquier wrote:
>>>>>>>>> On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:
>>>>>>>>>> Yeah, so 0001 patch sets existing wait events to recovery conflict
>>>>>>>>>> resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
>>>>>>>>>> to the recovery conflict on a snapshot. 0003 patch improves these wait
>>>>>>>>>> events by adding the new type of wait event such as
>>>>>>>>>> WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
>>>>>>>>>> is the fix for existing versions and 0003 patch is an improvement for
>>>>>>>>>> only PG13. Did you mean even 0001 patch doesn't fit for back-patching?
>>>>>>>>
>>>>>>>> Yes, it looks like a improvement rather than bug fix.
>>>>>>>>
>>>>>>>
>>>>>>> Okay, understand.
>>>>>>>
>>>>>>>>> I got my eyes on this patch set. The full patch set is in my opinion
>>>>>>>>> just a set of improvements, and not bug fixes, so I would refrain from
>>>>>>>>> back-backpatching.
>>>>>>>>
>>>>>>>> I think that the issue (i.e., "waiting" is reported twice needlessly
>>>>>>>> in PS display) that 0002 patch tries to fix is a bug. So it should be
>>>>>>>> fixed even in the back branches.
>>>>>>>
>>>>>>> So we need only two patches: one fixes process title issue and another
>>>>>>> improve wait event. I've attached updated patches.
>>>>>>
>>>>>> I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
>>>>>>
>>>>>> - ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
>>>>>> + ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
>>>>>>
>>>>>> Currently the wait event indicating the wait for buffer pin has already
>>>>>> been reported. But the above change in the patch changes the name of
>>>>>> wait event for buffer pin only in the startup process. Is this really useful?
>>>>>> Isn't the existing wait event for buffer pin enough?
>>>>>>
>>>>>> - /* Wait to be signaled by the release of the Relation Lock */
>>>>>> - ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>>>> + /* Wait to be signaled by the release of the Relation Lock */
>>>>>> + ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
>>>>>>
>>>>>> Same as above. Isn't the existing wait event enough?
>>>>>
>>>>> Yeah, we can use the existing wait events for buffer pin and lock.
>>>>>
>>>>>>
>>>>>> - /*
>>>>>> - * Progressively increase the sleep times, but not to more than 1s, since
>>>>>> - * pg_usleep isn't interruptible on some platforms.
>>>>>> - */
>>>>>> - standbyWait_us *= 2;
>>>>>> - if (standbyWait_us > 1000000)
>>>>>> - standbyWait_us = 1000000;
>>>>>> + WaitLatch(MyLatch,
>>>>>> + WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
>>>>>> + STANDBY_WAIT_MS,
>>>>>> + wait_event_info);
>>>>>> + ResetLatch(MyLatch);
>>>>>>
>>>>>> ResetLatch() should be called before WaitLatch()?
>>>>>
>>>>> Fixed.
>>>>>
>>>>>>
>>>>>> Could you tell me why you dropped the "increase-sleep-times" logic?
>>>>>
>>>>> I thought we can remove it because WaitLatch is interruptible but my
>>>>> observation was not correct. The waiting startup process is not
>>>>> necessarily woken up by signal. I think it's still better to not wait
>>>>> more than 1 sec even if it's an interruptible wait.
>>>>
>>>> So we don't need to use WaitLatch() there, i.e., it's ok to keep using
>>>> pg_usleep()?
>>>>
>>>>> Attached patch fixes the above and introduces only two wait events of
>>>>> conflict resolution: snapshot and tablespace.
>>>>
>>>> Many thanks for updating the patch!
>>>>
>>>> - /* Wait to be signaled by the release of the Relation Lock */
>>>> - ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>> + /* Wait to be signaled by the release of the Relation Lock */
>>>> + ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>> + }
>>>>
>>>> Is this change really valid? What happens if the latch is set during
>>>> ResolveRecoveryConflictWithVirtualXIDs()?
>>>> ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
>>>> is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.
>>>
>>> Thank you for reviewing the patch!
>>>
>>> You're right. It's better to keep using pg_usleep() and set the wait
>>> event by pgstat_report_wait_start().
>>>
>>>>
>>>> + default:
>>>> + event_name = "unknown wait event";
>>>> + break;
>>>>
>>>> Seems this default case should be removed. Please see other
>>>> pgstat_get_wait_xxx() function, so there is no such code.
>>>>
>>>>> I also removed the wait
>>>>> event of conflict resolution of database since it's unlikely to become
>>>>> a user-visible and a long sleep as we discussed before.
>>>>
>>>> Is it worth defining new wait event type RecoveryConflict only for
>>>> those two events? ISTM that it's ok to use IPC event type here.
>>>>
>>>
>>> I dropped a new wait even type and added them to IPC wait event type.
>>>
>>> I've attached the new version patch.
>>
>> Thanks for updating the patch! The patch looks good to me except
>> the following mior things.
>>
>> + <row>
>> + <entry><literal>RecoveryConflictSnapshot</literal></entry>
>> + <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
>> + </row>
>> + <row>
>> + <entry><literal>RecoveryConflictTablespace</literal></entry>
>> + <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
>> + </row>
>>
>> You need to increment the value of "morerows" in
>> "<entry morerows="38"><literal>IPC</literal></entry>".
>>
>> The descriptions of those two events should be placed in alphabetical order
>> for event name. That is, they should be placed above RecoveryPause.
>>
>> "vacuum cleanup" is better than "physical cleanup"?
>
> Agreed.
>
> I've attached the updated version patch.

Thanks! Looks good to me. Barring any objection, I will commit this patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-04-02 07:22:55 Re: color by default
Previous Message Alvaro Herrera 2020-04-02 07:01:56 Re: pgbench - add pseudo-random permutation function