Re: AIO v2.0

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, 陈宗志 <baotiao(at)gmail(dot)com>
Subject: Re: AIO v2.0
Date: 2025-01-06 16:28:39
Message-ID: 6vjl6jeaqvyhfbpgwziypwmhem2rwla4o5pgpuxwtg3o3o3jb5@evyzorb5meth
Lists: pgsql-hackers

Hi,

On 2024-12-19 17:29:12 -0500, Andres Freund wrote:
> > Not about patch itself, but questions about related stack functionality:
> > ----------------------------------------------------------------------------------------------------
> >
> >
> > 7. Is pg_stat_aios still on the table or not ? (AIO 2021 had it). Any hints
> > on how to inspect real I/O calls requested to review if the code is issuing
> > sensible calls: there's no strace for uring, or do you stick to DEBUG3 or
> > perhaps using some bpftrace / xfsslower is the best way to go ?
>
> I think we still want something like it, but I don't think it needs to be in
> the initial commits.

After I got this question from Thomas as well, I started hacking one up.

What information would you like to see?

Here's what I currently have:
┌─[ RECORD 1 ]───┬────────────────────────────────────────────────┐
│ pid            │ 358212                                         │
│ io_id          │ 2050                                           │
│ io_generation  │ 4209                                           │
│ state          │ COMPLETED_SHARED                               │
│ operation      │ read                                           │
│ offset         │ 509083648                                      │
│ length         │ 262144                                         │
│ subject        │ smgr                                           │
│ iovec_data_len │ 32                                             │
│ raw_result     │ 262144                                         │
│ result         │ OK                                             │
│ error_desc     │ (null)                                         │
│ subject_desc   │ blocks 1372864..1372895 in file "base/5/16388" │
│ flag_sync      │ f                                              │
│ flag_localmem  │ f                                              │
│ flag_buffered  │ t                                              │
├─[ RECORD 2 ]───┼────────────────────────────────────────────────┤
│ pid            │ 358212                                         │
│ io_id          │ 2051                                           │
│ io_generation  │ 4199                                           │
│ state          │ IN_FLIGHT                                      │
│ operation      │ read                                           │
│ offset         │ 511967232                                      │
│ length         │ 262144                                         │
│ subject        │ smgr                                           │
│ iovec_data_len │ 32                                             │
│ raw_result     │ (null)                                         │
│ result         │ UNKNOWN                                        │
│ error_desc     │ (null)                                         │
│ subject_desc   │ blocks 1373216..1373247 in file "base/5/16388" │
│ flag_sync      │ f                                              │
│ flag_localmem  │ f                                              │
│ flag_buffered  │ t                                              │
└────────────────┴────────────────────────────────────────────────┘
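
For readers cross-checking the sample rows: the numbers are consistent with the default build settings (8 kB blocks, 1 GB segment files), under which `length` covers 32 blocks and `offset` works out to a byte position inside segment 10 of the relation. The sketch below only reproduces that arithmetic. The `SKETCH_*` constants and `sketch_*` helpers are made up for illustration and are not PostgreSQL functions, and the default block and segment sizes are an assumption about the build that produced the output.

```c
#include <stdint.h>

/* Assumed defaults; both are compile-time options in PostgreSQL. */
#define SKETCH_BLCKSZ      8192u    /* bytes per block */
#define SKETCH_RELSEG_SIZE 131072u  /* blocks per 1 GB segment file */

/* Number of blocks covered by an IO of `length` bytes. */
static uint32_t
sketch_nblocks(uint64_t length)
{
    return (uint32_t) (length / SKETCH_BLCKSZ);
}

/* Segment file number that contains relation block `blkno`. */
static uint32_t
sketch_segment(uint32_t blkno)
{
    return blkno / SKETCH_RELSEG_SIZE;
}

/* Byte offset of relation block `blkno` within its segment file. */
static uint64_t
sketch_segment_offset(uint32_t blkno)
{
    return (uint64_t) (blkno % SKETCH_RELSEG_SIZE) * SKETCH_BLCKSZ;
}
```

With these assumptions, `sketch_segment_offset(1372864)` equals the 509083648 shown in record 1 and `sketch_segment_offset(1373216)` equals the 511967232 in record 2, which suggests that `offset` is relative to the segment file rather than to the whole relation.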

I didn't think that pg_stat_* was quite the right namespace, given that it
shows not stats, but the currently ongoing IOs. I am going with pg_aios for
now, but I don't particularly like that.

I think we'll want a pg_stat_aio as well, tracking things like:

- how often the queue to IO workers was full
- how many times we submitted IO to the kernel (<= #ios with io_uring)
- how many times we asked the kernel for events (<= #ios with io_uring)
- how many times we had to wait for in-flight IOs before issuing more IOs
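
To make the list above concrete, here is a minimal sketch of what such counters could look like. Everything in it is hypothetical: `AioStatsSketch`, its field names, and the two helper functions are invented for illustration and do not correspond to anything in the patch. It also shows why the submission count can be lower than the number of IOs with io_uring, since one submission may carry several IOs.

```c
#include <stdint.h>

/* Hypothetical per-backend counters, one per item in the list above. */
typedef struct AioStatsSketch
{
    uint64_t ios_started;         /* individual IOs handed off */
    uint64_t worker_queue_full;   /* queue to IO workers was full */
    uint64_t kernel_submits;      /* submissions to the kernel */
    uint64_t kernel_event_waits;  /* times the kernel was asked for events */
    uint64_t waits_for_inflight;  /* had to wait before issuing more IOs */
} AioStatsSketch;

/* Account for one submission that carries `nios` IOs in a single batch. */
static void
sketch_count_submit(AioStatsSketch *stats, uint32_t nios)
{
    stats->ios_started += nios;
    stats->kernel_submits += 1;
}

/* Account for one request to the kernel for completion events. */
static void
sketch_count_event_wait(AioStatsSketch *stats)
{
    stats->kernel_event_waits += 1;
}
```

Submitting one batch of four IOs and then a single IO would leave `kernel_submits` at 2 and `ios_started` at 5, which is the "<= #ios" relationship noted in the list.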

Greetings,

Andres Freund
