From: | Paul Guo <paulguo(at)gmail(dot)com> |
---|---|
To: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case? |
Date: | 2021-05-27 13:59:10 |
Message-ID: | CABQrizf0Ji-VnLGokbfGSxLM2R+j0d4kShAh+tVw1r-1M-MY-g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, May 27, 2021 at 7:11 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Tue, May 25, 2021 at 4:39 PM Paul Guo <paulguo(at)gmail(dot)com> wrote:
> >
> > Hi hackers,
> >
> > I found this when reading the related code. Here is the scenario:
> >
> > bool
> > RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
> > bool retryOnError)
> >
> > For the case retryOnError is true, the function would in loop call
> > ForwardSyncRequest() until it succeeds, but in ForwardSyncRequest(),
> > we can see if we run into the below branch, RegisterSyncRequest() will
> > need to loop until the checkpointer absorbs the existing requests so
> > ForwardSyncRequest() might hang for some time until a checkpoint
> > request is triggered. This scenario seems to be possible in theory
> > though the chance is not high.
>
> It seems like a really unlikely scenario, but maybe possible if you
> use a lot of unlogged tables maybe (as you could eventually
> dirty/evict more than NBuffers buffers without triggering enough WALs
> activity to trigger a checkpoint with any sane checkpoint
> configuration).
RegisterSyncRequest() handles SYNC_UNLINK_REQUEST and
SYNC_FORGET_REQUEST scenarios, besides the usual SYNC_REQUEST type for
buffer sync.
> > ForwardSyncRequest():
> >
> > if (CheckpointerShmem->checkpointer_pid == 0 ||
> > (CheckpointerShmem->num_requests >= CheckpointerShmem->max_requests &&
> > !CompactCheckpointerRequestQueue()))
> > {
> > /*
> > * Count the subset of writes where backends have to do their own
> > * fsync
> > */
> > if (!AmBackgroundWriterProcess())
> > CheckpointerShmem->num_backend_fsync++;
> > LWLockRelease(CheckpointerCommLock);
> > return false;
> > }
> >
> > One fix is to add below similar code in RegisterSyncRequest(), trigger
> > a checkpoint for the scenario.
> >
> > // checkpointer_triggered: variable for one trigger only.
> > if (!ret && retryOnError && ProcGlobal->checkpointerLatch &&
> > !checkpointer_triggered)
> > SetLatch(ProcGlobal->checkpointerLatch);
> >
> > Any comments?
>
> It looks like you intended to set the checkpointer_triggered var but
Yes this is just pseduo code.
> didn't. Also this will wake up the checkpointer but won't force a
> checkpoint (unlike RequestCheckpoint()). It may be a good thing
I do not expect an immediate checkpoint. AbsorbSyncRequests()
is enough since after that RegisterSyncRequest() could finish.
> though as it would only absorb the requests and go back to sleep if no
> other threshold is reachrf. Apart from the implementation details it
> seems like it could help in this unlikely event.
--
Paul Guo (Vmware)
From | Date | Subject | |
---|---|---|---|
Next Message | Paul Guo | 2021-05-27 14:04:53 | Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case? |
Previous Message | Paul Guo | 2021-05-27 13:50:28 | Re: pg_rewind fails if there is a read only file. |