From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: thomas(dot)munro(at)gmail(dot)com
Cc: andres(at)anarazel(dot)de, myon(at)debian(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call
Date: 2023-05-24 01:56:28
Message-ID: 20230524.105628.2202582176350190855.horikyota.ntt@gmail.com
Lists: pgsql-committers pgsql-hackers
At Wed, 26 Apr 2023 11:37:55 +1200, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote in
> On Tue, Apr 25, 2023 at 12:16 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2023-04-24 15:32:25 -0700, Andres Freund wrote:
> > > We obviously can add a retry loop to FileFallocate(), similar to what's
> > > already present e.g. in FileRead(). But I wonder if we shouldn't go a bit
> > > further, and do it for all the fd.c routines where it's remotely plausible
> > > EINTR could be returned? It's a bit silly to add EINTR retries one-by-one to
> > > the functions.
> > >
> > >
> > > The following are documented to potentially return EINTR, without fd.c having
> > > code to retry:
> > >
> > > - FileWriteback() / pg_flush_data()
> > > - FileSync() / pg_fsync()
> > > - FileFallocate()
> > > - FileTruncate()
> > >
> > > With the first two there's the added complication that it's not entirely
> > > obvious whether it'd be better to handle this in File* or pg_*. I'd argue the
> > > latter is a bit more sensible?
> >
> > A prototype of that approach is attached. I pushed the retry handling into the
> > pg_* routines where applicable. I guess we could add pg_* routines for
> > FileFallocate(), FilePrewarm() etc as well, but I didn't do that here.
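(For illustration, a bare retry loop of the kind proposed above might look like the sketch below. It uses the Linux fallocate() call directly and made-up names, not the actual fd.c code.)

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical wrapper: retry the allocation while it is interrupted. */
static int
fallocate_retry_eintr(int fd, off_t offset, off_t len)
{
    int     rc;

    do
    {
        rc = fallocate(fd, 0, offset, len);
    } while (rc != 0 && errno == EINTR);

    return rc;
}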
>
> One problem we ran into with the shm_open() case (which is nearly
> identical under the covers, since shm_open() just opens a file in a
> tmpfs on Linux) was that a simple retry loop like this could never
> terminate if the process was receiving a lot of signals from the
> recovery process, which is why we went with the idea of masking
> signals instead. Eventually we should probably grow the file in
> smaller chunks with a CFI in between so that we both guarantee that we
> make progress (by masking for smaller size increases) and service
> interrupts in a timely fashion (by unmasking between loops). I don't
> think that applies here because we're not trying to fallocate
> humongous size increases in one go, but I just want to note that we're
> making a different choice. I think this looks reasonable for the use
> cases we actually have here.
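That chunked approach, sketched with plain POSIX calls (CHUNK_SIZE and check_for_interrupts() below are placeholders, not the real PostgreSQL symbols), could look roughly like this:

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>

#define CHUNK_SIZE (1024 * 1024)        /* hypothetical chunk size */

extern void check_for_interrupts(void); /* stand-in for CHECK_FOR_INTERRUPTS() */

static int
extend_in_chunks(int fd, off_t offset, off_t len)
{
    sigset_t    block_all;
    sigset_t    saved;

    sigfillset(&block_all);

    while (len > 0)
    {
        off_t   chunk = (len < CHUNK_SIZE) ? len : CHUNK_SIZE;
        int     rc;

        /* Mask signals so this bounded call can finish undisturbed. */
        sigprocmask(SIG_BLOCK, &block_all, &saved);
        rc = fallocate(fd, 0, offset, chunk);
        sigprocmask(SIG_SETMASK, &saved, NULL);

        if (rc != 0)
            return rc;

        /* Service pending interrupts between chunks. */
        check_for_interrupts();

        offset += chunk;
        len -= chunk;
    }

    return 0;
}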
FWIW I share the same concern about looping on EINTR without signals
being blocked. If we just retry the same operation without processing
signals after getting EINTR, I think blocking signals is the better
approach. We could block signals more gracefully, but I'm not sure it's
worth the complexity.
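Roughly what I have in mind, sketched with plain sigprocmask()/fsync() rather than the actual fd.c routines:

#include <signal.h>
#include <unistd.h>

/* Illustrative only: block signals for the duration of the call. */
static int
fsync_signals_blocked(int fd)
{
    sigset_t    block_all;
    sigset_t    saved;
    int         rc;

    sigfillset(&block_all);
    sigprocmask(SIG_BLOCK, &block_all, &saved);
    rc = fsync(fd);
    sigprocmask(SIG_SETMASK, &saved, NULL);

    return rc;
}

With signals blocked, the call cannot return EINTR in the first place, so no retry loop is needed.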
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center