From: | James Hunter <james(dot)hunter(dot)pg(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
Subject: | Re: AIO v2.3 |
Date: | 2025-02-11 17:10:51 |
Message-ID: | CAJVSvF4Cpt4T8733x=66AXs16WxFKQUjkrpS+1-5sX_qe=1dbQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Feb 10, 2025 at 2:40 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
...
> Problem statement: You want to be able to batch I/O submission, i.e.
> make a single call to io_uring_enter() (and other mechanisms) to start
> several I/Os, but the code that submits is inside StartReadBuffers()
> and the code that knows how many I/Os it wants to start now is at a
> higher level, read_stream.c and in future elsewhere. So you invented
> this flag to tell StartReadBuffers() not to call
> pgaio_submit_staged(), because you promise to do it later, via this
> staging list. Additionally, there is a kind of programming rule here
> that you *must* submit I/Os that you stage, you aren't allowed to (for
> example) stage I/Os and then sleep, so it has to be a fairly tight
> piece of code.
>
> Would the API be better like this?: When you want to create a batch
> of I/Os submitted together, you wrap the work in pgaio_begin_batch()
> and pgaio_submit_batch(), eg the loop in read_stream_lookahead().
> Then bufmgr wouldn't need this flag: when it (or anything else) calls
> smgrstartreadv(), if there is not currently an explicit batch then it
> would be submitted immediately, and otherwise it would only be staged.
> This way, batch construction (or whatever word you prefer for batch)
> is in a clearly and explicitly demarcated stretch of code in one
> lexical scope (though its effect is dynamically scoped just like the
> staging list itself because we don't want to pass explicit I/O
> contexts through the layers), but code that doesn't call those and
> reaches AsyncReadBuffer() or whatever gets an implicit batch of size
> one and that's also OK. Not sure what semantics nesting would have
> but I doubt it matters much.
I like this idea. If we want to submit a batch, then just submit a batch.
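
To make the proposed semantics concrete, here is a toy model of them as I read the description above. It is only a sketch, not PostgreSQL code: every toy_* name, the fixed-size staging array, and the counters are hypothetical stand-ins for pgaio_begin_batch(), pgaio_submit_batch(), pgaio_submit_staged() and smgrstartreadv(), and rejecting nested batches is an assumption, since the nesting semantics are left open.

```c
#include <assert.h>

#define TOY_MAX_STAGED 16

/* Hypothetical stand-in for the AIO staging list. */
static int toy_staged[TOY_MAX_STAGED];  /* staged block numbers */
static int toy_nstaged = 0;             /* I/Os staged but not yet submitted */
static int toy_in_batch = 0;            /* inside an explicit batch? */

/* Counters that stand in for real kernel submissions. */
static int toy_submit_calls = 0;        /* e.g. number of submission syscalls */
static int toy_ios_submitted = 0;       /* total I/Os handed over */

/* Hand every staged I/O over in a single simulated submission call. */
static void
toy_submit_staged(void)
{
    if (toy_nstaged == 0)
        return;
    toy_submit_calls++;
    toy_ios_submitted += toy_nstaged;
    toy_nstaged = 0;
}

/* Open an explicit batch: later reads are staged, not submitted. */
static void
toy_begin_batch(void)
{
    assert(!toy_in_batch);      /* assumption: no nesting in this sketch */
    toy_in_batch = 1;
}

/* Close the batch and submit everything staged inside it at once. */
static void
toy_submit_batch(void)
{
    assert(toy_in_batch);
    toy_in_batch = 0;
    toy_submit_staged();
}

/* Stand-in for a low-level "start read" call; it takes no flag. */
static void
toy_start_read(int blockno)
{
    if (toy_nstaged == TOY_MAX_STAGED)
        toy_submit_staged();    /* staging list full: flush early */
    toy_staged[toy_nstaged++] = blockno;
    if (!toy_in_batch)
        toy_submit_staged();    /* implicit batch of size one */
}

/* Usage: one unbatched read, then four reads in one explicit batch. */
void
toy_demo(void)
{
    int blockno;

    toy_start_read(1);          /* submitted immediately */

    toy_begin_batch();
    for (blockno = 10; blockno < 14; blockno++)
        toy_start_read(blockno);    /* only staged */
    toy_submit_batch();         /* one submission covers all four */
}
```

In this model the caller that knows how many I/Os it wants is the only place that mentions batching, and code that never opens a batch still behaves correctly, just with one submission per I/O.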
James