Re: AIO v2.5

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
Subject: Re: AIO v2.5
Date: 2025-03-20 18:19:04
Message-ID: 20250320181904.8a.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 20, 2025 at 01:05:05PM -0400, Andres Freund wrote:
> On 2025-03-19 18:17:37 -0400, Andres Freund wrote:
> > On 2025-03-19 14:25:30 -0700, Noah Misch wrote:
> > > > + * marked as failed. In case of a partial read, some buffers may be
> > > > + * ok.
> > > > + */
> > > > + failed =
> > > > + prior_result.status == ARS_ERROR
> > > > + || prior_result.result <= buf_off;
> > >
> > > I didn't run an experiment to check the following, but I think this should be
> > > s/<=/</. Suppose we requested two blocks and read some amount of bytes
> > > [1*BLCKSZ, 2*BLSCKSZ - 1]. md_readv_complete will store result=1. buf_off==0
> > > should compute failed=false here, but buf_off==1 should compute failed=true.
> >
> > Huh, you might be right. I thought I wrote a test for this, I wonder why it
> > didn't catch the problem...
>
> It was correct as-is. With result=1 you get precisely the result you describe
> as the desired outcome, no?
> prior_result.result <= buf_off
> ->
> 1 <= 0 -> failed = 0
> 1 <= 1 -> failed = 1
>
> but if it were < as you suggest:
>
> prior_result.result < buf_off
> ->
> 1 < 0 -> failed = 0
> 1 < 1 -> failed = 0
>
> I.e. we would assume that the second buffer also completed.

That's right. I see it now. My mistake.

> What does concern me is that the existing tests do *not* catch the problem if
> I turn "<=" into "<". The second buffer in this case wrongly gets marked as
> valid. We do retry the read (because bufmgr.c thinks only one block was read),
> but find the buffer to already be valid.
>
> The reason the test doesn't fail, is that the way I set up the "short read"
> tests. The injection point runs after the IO completed and just modifies the
> result. However, the actual buffer contents still got modified.
>
>
> The easiest way around that seems to be to have the injection point actually
> zero out the remaining memory.

Sounds reasonable and sufficient.

FYI, I've resumed the comprehensive review. That's still ongoing.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2025-03-20 18:24:49 Re: making EXPLAIN extensible
Previous Message Nathan Bossart 2025-03-20 18:13:02 Re: Disabling vacuum truncate for autovacuum