From: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Some problems of recovery conflict wait events |
Date: | 2020-03-30 11:10:01 |
Message-ID: | CA+fd4k51DA2sYvAzQ5wKzrbgtRmLmuDQ0i5F4y0i2dat5+4yaA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
>
>
> On 2020/03/04 14:31, Masahiko Sawada wrote:
> > On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> >>
> >>
> >>
> >> On 2020/03/04 13:27, Michael Paquier wrote:
> >>> On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:
> >>>> Yeah, so 0001 patch sets existing wait events to recovery conflict
> >>>> resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
> >>>> to the recovery conflict on a snapshot. 0003 patch improves these wait
> >>>> events by adding the new type of wait event such as
> >>>> WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
> >>>> is the fix for existing versions and 0003 patch is an improvement for
> >>>> only PG13. Did you mean even 0001 patch doesn't fit for back-patching?
> >>
> >> Yes, it looks like a improvement rather than bug fix.
> >>
> >
> > Okay, understand.
> >
> >>> I got my eyes on this patch set. The full patch set is in my opinion
> >>> just a set of improvements, and not bug fixes, so I would refrain from
> >>> back-backpatching.
> >>
> >> I think that the issue (i.e., "waiting" is reported twice needlessly
> >> in PS display) that 0002 patch tries to fix is a bug. So it should be
> >> fixed even in the back branches.
> >
> > So we need only two patches: one fixes process title issue and another
> > improve wait event. I've attached updated patches.
>
> I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
>
> - ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
> + ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
>
> Currently the wait event indicating the wait for buffer pin has already
> been reported. But the above change in the patch changes the name of
> wait event for buffer pin only in the startup process. Is this really useful?
> Isn't the existing wait event for buffer pin enough?
>
> - /* Wait to be signaled by the release of the Relation Lock */
> - ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
> + /* Wait to be signaled by the release of the Relation Lock */
> + ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
>
> Same as above. Isn't the existing wait event enough?
Yeah, we can use the existing wait events for buffer pin and lock.
>
> - /*
> - * Progressively increase the sleep times, but not to more than 1s, since
> - * pg_usleep isn't interruptible on some platforms.
> - */
> - standbyWait_us *= 2;
> - if (standbyWait_us > 1000000)
> - standbyWait_us = 1000000;
> + WaitLatch(MyLatch,
> + WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
> + STANDBY_WAIT_MS,
> + wait_event_info);
> + ResetLatch(MyLatch);
>
> ResetLatch() should be called before WaitLatch()?
Fixed.
>
> Could you tell me why you dropped the "increase-sleep-times" logic?
I thought we can remove it because WaitLatch is interruptible but my
observation was not correct. The waiting startup process is not
necessarily woken up by signal. I think it's still better to not wait
more than 1 sec even if it's an interruptible wait.
Attached patch fixes the above and introduces only two wait events of
conflict resolution: snapshot and tablespace. I also removed the wait
event of conflict resolution of database since it's unlikely to become
a user-visible and a long sleep as we discussed before.
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
recovery_conflict_wait_event_v3.patch | application/octet-stream | 7.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | k.jamison@fujitsu.com | 2020-03-30 11:59:08 | RE: [Patch] Optimize dropping of relation buffers using dlist |
Previous Message | Rajkumar Raghuwanshi | 2020-03-30 10:43:47 | Re: WIP/PoC for parallel backup |