From: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Some problems of recovery conflict wait events |
Date: | 2020-02-29 03:36:30 |
Message-ID: | CA+fd4k7_f6-yQLiwH0YVKN-J2C1NRbOJxF1LbAZW=kn-98X4=w@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, 26 Feb 2020 at 16:19, Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Tue, 18 Feb 2020 at 17:58, Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > Hi all,
> >
> > When recovery conflicts happen on a streaming replication standby,
> > the wait event of the startup process is null when
> > max_standby_streaming_delay = 0 (to be exact, when the limit time
> > calculated from max_standby_streaming_delay is behind the last WAL
> > data receipt time). Moreover, the process title of the waiting startup
> > process looks odd in the case of lock conflicts.
> >
> > 1. When max_standby_streaming_delay > 0 and the startup process
> > conflicts with a lock,
> >
> > * wait event
> > backend_type | wait_event_type | wait_event
> > --------------+-----------------+------------
> > startup | Lock | relation
> > (1 row)
> >
> > * ps
> > 42513 ?? Ss 0:00.05 postgres: startup recovering
> > 000000010000000000000003 waiting
> >
> > Looks good.
> >
> > 2. When max_standby_streaming_delay > 0 and the startup process
> > conflicts with a snapshot,
> >
> > * wait event
> > backend_type | wait_event_type | wait_event
> > --------------+-----------------+------------
> > startup | |
> > (1 row)
> >
> > * ps
> > 44299 ?? Ss 0:00.05 postgres: startup recovering
> > 000000010000000000000003 waiting
> >
> > wait_event_type and wait_event are null in spite of waiting for
> > conflict resolution.
> >
> > 3. When max_standby_streaming_delay = 0 and the startup process
> > conflicts with a lock,
> >
> > * wait event
> > backend_type | wait_event_type | wait_event
> > --------------+-----------------+------------
> > startup | |
> > (1 row)
> >
> > * ps
> > 46510 ?? Ss 0:00.05 postgres: startup recovering
> > 000000010000000000000003 waiting waiting
> >
> > wait_event_type and wait_event are null and the process title is
> > wrong; "waiting" appears twice.
> >
> > The cause of the first problem (wait_event_type and wait_event are
> > not set) is that WaitExceedsMaxStandbyDelay, which is called by
> > ResolveRecoveryConflictWithVirtualXIDs, waits for other transactions
> > using pg_usleep rather than WaitLatch. I think we can change it to
> > use WaitLatch and have its callers pass the wait event information.
> >
> > For the second problem (wrong process title), the cause is also
> > related to ResolveRecoveryConflictWithVirtualXIDs: in the case of lock
> > conflicts we add "waiting" to the process title in WaitOnLock, but we
> > add it again in ResolveRecoveryConflictWithVirtualXIDs. I think we can
> > have WaitOnLock not set the process title in the recovery case.
> >
> > This problem exists on 12, 11 and 10. I'll submit the patch.
> >
>
> I've attached patches that fix the above two issues.
>
> The 0001 patch fixes the first problem. Currently there are 5 types of
> recovery conflict resolution: snapshot, tablespace, lock, database and
> buffer pin, and we set wait events for only 2 of the 5: lock
> (only when doing ProcWaitForSignal) and buffer pin. Therefore, users
> cannot tell whether the startup process is waiting, or what it is
> waiting for. This patch sets wait events for 3 more cases: snapshot,
> tablespace and lock. For those 3 cases, I thought that
> we could create a new, more appropriate wait event type, say
> RecoveryConflict, and set it for them. However, considering
> back-patching to existing versions, adding a new wait event type would
> not be acceptable. So this patch sets existing wait events such as
> PG_WAIT_LOCK in those 3 places and does not set a wait event for
> conflict resolution on dropping a database because there is no
> appropriate existing one. I'll start a separate thread about
> improving the wait events of recovery conflict resolution for PG13 if
> necessary.
The attached patch improves the wait events of recovery conflict
resolution. It's for PG13. I added a new RecoveryConflict
wait_event_type and some new wait_events. This patch can be applied on
top of the two patches I already proposed.
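
To illustrate the general idea only (this is not code from the patch), here is a minimal, self-contained sketch of how a conflict wait can be bracketed by wait event reporting so that a monitoring view sees a non-null type and event while the process waits. Every name prefixed with demo_ or DEMO_ is hypothetical, as are the numeric class ID and the event names; none of them are PostgreSQL's actual definitions, and the polling loop merely stands in for a real latch wait.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoding: high byte = wait class, low bits = event. */
#define DEMO_WAIT_CLASS_MASK          0xFF000000U
#define DEMO_WAIT_RECOVERY_CONFLICT   0x0B000000U   /* made-up class id */
#define DEMO_WAIT_CONFLICT_SNAPSHOT   (DEMO_WAIT_RECOVERY_CONFLICT | 1U)
#define DEMO_WAIT_CONFLICT_TABLESPACE (DEMO_WAIT_RECOVERY_CONFLICT | 2U)
#define DEMO_WAIT_CONFLICT_LOCK       (DEMO_WAIT_RECOVERY_CONFLICT | 3U)
#define DEMO_WAIT_CONFLICT_DATABASE   (DEMO_WAIT_RECOVERY_CONFLICT | 4U)

/* What a monitoring view would read; 0 means "not waiting". */
static uint32_t demo_wait_event_info = 0;
/* Last value seen by the demo callback while a wait was in progress. */
static uint32_t demo_observed = 0;

static void demo_wait_start(uint32_t info)
{
    assert(info != 0);          /* 0 is reserved for "not waiting" */
    demo_wait_event_info = info;
}

static void demo_wait_end(void)
{
    demo_wait_event_info = 0;
}

/* Returns NULL when not waiting, mirroring a null wait_event_type. */
static const char *demo_wait_event_type(uint32_t info)
{
    if (info == 0)
        return NULL;
    if ((info & DEMO_WAIT_CLASS_MASK) == DEMO_WAIT_RECOVERY_CONFLICT)
        return "RecoveryConflict";
    return "Unknown";
}

static const char *demo_wait_event_name(uint32_t info)
{
    switch (info)
    {
        case DEMO_WAIT_CONFLICT_SNAPSHOT:   return "Snapshot";
        case DEMO_WAIT_CONFLICT_TABLESPACE: return "Tablespace";
        case DEMO_WAIT_CONFLICT_LOCK:       return "Lock";
        case DEMO_WAIT_CONFLICT_DATABASE:   return "Database";
        default:                            return NULL;
    }
}

typedef int (*demo_conflict_check)(void *arg);

/*
 * Advertise the wait event, poll until the conflict is gone, then clear it.
 * Returns the number of polls that still saw a conflict.
 */
static int demo_resolve_conflict(uint32_t info,
                                 demo_conflict_check still_conflicting,
                                 void *arg)
{
    int polls = 0;

    demo_wait_start(info);
    while (still_conflicting(arg))
        polls++;                /* a real implementation would sleep here */
    demo_wait_end();
    return polls;
}

/* Demo callback: reports a conflict until the counter reaches zero. */
static int demo_countdown(void *arg)
{
    int *remaining = arg;

    demo_observed = demo_wait_event_info;   /* what a monitor would see */
    return (*remaining)-- > 0;
}
```

A caller would pass one of the DEMO_WAIT_CONFLICT_* values together with a check function such as demo_countdown to demo_resolve_conflict. While the loop runs, demo_wait_event_type and demo_wait_event_name return non-null strings for the advertised value, and once the wait ends the value is reset to 0.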
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
0003-Improve-wait-events-of-recovery-conflict-resolution.patch | application/octet-stream | 7.9 KB |