Re: AIO v2.5

From: Andres Freund <andres(at)anarazel(dot)de>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
Subject: Re: AIO v2.5
Date: 2025-03-21 01:58:37
Message-ID: 6ak556uyqiptdwjaci4kbi5eykwkmzqgkbtkyaosjnopjhncrc@2v4ac2jwyz22
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Attached v2.11, with the following changes:

- Pushed the smgr interrupt change, as discussed on the dedicated thread

- Pushed "bufmgr: Improve stats when a buffer is read in concurrently"

It was reviewed by Melanie and there didn't seem to be any reason to wait
further.

- Addressed feedback from Melanie

- Addressed feedback from Noah

- Added a new commit: aio: Change prefix of PgAioResultStatus values to PGAIO_RS_

As suggested/requested by Melanie. I think she's unfortunately right.

- Added a patch for some comment fixups for code that's either older or
already pushed

- Added an error check for FileStartReadV() failing

FileStartReadV() actually can fail, if the file can't be re-opened. I
thought it'd be important for the error message to differ from the one
that's issued for read actually failing, so I went with:

"could not start reading blocks %u..%u in file \"%s\": %m"

but I'm not sure how good that is.

- Added a new commit to redefine set_max_safe_fds() to not subtract
already_open fds from max_files_per_process

This prevents io_method=io_uring from failing when RLIMIT_NOFILE is high
enough, but more than max_files_per_process io_uring instances need to be
created.

- Improved error message if io_uring_queue_init() fails

Added errhint()s for likely cases of failure.

Added errcode(). I was tempted to use errcode_for_file_access(), but that
doesn't support ENOSYS - perhaps I should add that instead?

- Disable io_uring method when using EXEC_BACKEND, they're not compatible

I chose to do this with a define aio.h, but I guess we could also do it at
configure time? That seems more complicated though - how would we even know
that EXEC_BACKEND is used on non-windows?

Not sure yet how to best disable testing io_uring in this case. We can't
just query EXEC_BACKEND from pg_config.h unfortunately. I guess making the
initdb not fail and checking the error log would work, but that doesn't work
nicely with Cluster.pm.

- Changed test_aio injection short-read injection point to zero out the rest
of the IO, otherwise some tests fail to fail even if a bug in retries of
partial reads is introduced

- Improved method_io_uring.c includes a bit (no pgstat.h)

Questions:

- We only "look" at BM_IO_ERROR for writes, isn't that somewhat weird?

See AbortBufferIO(Buffer buffer)

It doesn't really matter for the patchset, but it just strikes me as an oddity.

Greetings,

Andres Freund

Attachment Content-Type Size
v2.11-0001-aio-bufmgr-Comment-fixes.patch text/x-diff 2.8 KB
v2.11-0002-aio-Change-prefix-of-PgAioResultStatus-values-.patch text/x-diff 4.0 KB
v2.11-0003-Redefine-max_files_per_process-to-control-addi.patch text/x-diff 4.4 KB
v2.11-0004-aio-Add-liburing-dependency.patch text/x-diff 13.4 KB
v2.11-0005-aio-Add-io_method-io_uring.patch text/x-diff 21.4 KB
v2.11-0006-aio-Implement-support-for-reads-in-smgr-md-fd.patch text/x-diff 22.8 KB
v2.11-0007-aio-Add-README.md-explaining-higher-level-desi.patch text/x-diff 19.0 KB
v2.11-0008-localbuf-Track-pincount-in-BufferDesc-as-well.patch text/x-diff 4.5 KB
v2.11-0009-bufmgr-Implement-AIO-read-support.patch text/x-diff 30.1 KB
v2.11-0010-Support-buffer-forwarding-in-read_stream.c.patch text/x-diff 8.7 KB
v2.11-0011-Support-buffer-forwarding-in-StartReadBuffers.patch text/x-diff 10.5 KB
v2.11-0012-bufmgr-Use-AIO-in-StartReadBuffers.patch text/x-diff 26.8 KB
v2.11-0013-aio-Basic-read_stream-adjustments-for-real-AIO.patch text/x-diff 3.8 KB
v2.11-0014-docs-Reframe-track_io_timing-related-docs-as-w.patch text/x-diff 5.3 KB
v2.11-0015-Enable-IO-concurrency-on-all-systems.patch text/x-diff 11.2 KB
v2.11-0016-aio-Add-pg_aios-view.patch text/x-diff 9.8 KB
v2.11-0017-aio-Add-test_aio-module.patch text/x-diff 45.0 KB
v2.11-0018-aio-Experimental-heuristics-to-increase-batchi.patch text/x-diff 1.6 KB
v2.11-0019-aio-Implement-smgr-md-fd-write-support.patch text/x-diff 13.3 KB
v2.11-0020-aio-Add-bounce-buffers.patch text/x-diff 23.1 KB
v2.11-0021-bufmgr-Implement-AIO-write-support.patch text/x-diff 9.8 KB
v2.11-0022-aio-Add-IO-queue-helper.patch text/x-diff 7.4 KB
v2.11-0023-bufmgr-use-AIO-in-checkpointer-bgwriter.patch text/x-diff 31.3 KB
v2.11-0024-Ensure-a-resowner-exists-for-all-paths-that-ma.patch text/x-diff 2.6 KB
v2.11-0025-Temporary-Increase-BAS_BULKREAD-size.patch text/x-diff 1.3 KB
v2.11-0026-WIP-Use-MAP_POPULATE.patch text/x-diff 1.1 KB
v2.11-0027-StartReadBuffers-debug-stuff.patch text/x-diff 1.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2025-03-21 02:50:11 Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Previous Message Sami Imseih 2025-03-21 01:46:55 Re: Proposal - Allow extensions to set a Plan Identifier