From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> |
Subject: | Re: AIO v2.5 |
Date: | 2025-03-25 18:58:37 |
Message-ID: | 5ons2rtmwarqqhhexb3dnqulw5rjgwgoct57vpdau4rujlrffj@3fls6d2mkiwc |
Lists: | pgsql-hackers |
Hi,
On 2025-03-25 08:58:08 -0700, Noah Misch wrote:
> While having nagging thoughts that we might be releasing FDs before io_uring
> gets them into kernel custody, I tried this hack to maximize FD turnover:
>
> static void
> ReleaseLruFiles(void)
> {
> #if 0
>     while (nfile + numAllocatedDescs + numExternalFDs >= max_safe_fds)
>     {
>         if (!ReleaseLruFile())
>             break;
>     }
> #else
>     while (ReleaseLruFile())
>         ;
> #endif
> }
>
> "make check" with default settings (io_method=worker) passes, but
> io_method=io_uring in the TEMP_CONFIG file got different diffs in each of two
> runs. s/#if 0/#if 1/ (restore normal FD turnover) removes the failures.
> Here's the richer of the two diffs:
Yikes. That's a very good catch.
I spent a bit of time debugging this. I think I see what's going on - it turns
out that the kernel does *not* open the FDs during io_uring_enter() if
IOSQE_ASYNC is specified [1]. We add that flag heuristically, in an attempt to
avoid a small but measurable slowdown for sequential scans that are fully
buffered (cf. pgaio_uring_submit()). If I disable that heuristic, your patch
above passes all tests here.
I don't know if that's an intentional or unintentional behavioral difference.
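To illustrate why late FD resolution is dangerous, here is a minimal sketch in plain POSIX C. It does not use io_uring at all; QueuedRead, make_file(), execute_late() and demo_fd_reuse() are hypothetical names made up for this example. It only models the effect: a queued IO remembers an FD number, the FD is closed (as the LRU release would do), the number gets recycled for another file, and the deferred read then hits the wrong file. It assumes a single-threaded process and a writable /tmp.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for a queued IO: it records only the FD number. */
typedef struct QueuedRead
{
    int     fd;
    char    buf[16];
} QueuedRead;

/* Create an unlinked temp file containing `content`; return its FD, or -1. */
static int
make_file(const char *content)
{
    char    path[] = "/tmp/fd_reuse_XXXXXX";
    size_t  len = strlen(content);
    int     fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, content, len) != (ssize_t) len ||
        lseek(fd, 0, SEEK_SET) != 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}

/* Resolve the FD number only now, the way a deferred worker would. */
static void
execute_late(QueuedRead *io)
{
    ssize_t n = read(io->fd, io->buf, sizeof(io->buf) - 1);

    io->buf[n > 0 ? n : 0] = '\0';
}

/* Returns 1 if the deferred read ended up reading the wrong file, -1 on setup failure. */
static int
demo_fd_reuse(void)
{
    QueuedRead io;
    int     fd_a;
    int     fd_b;

    fd_a = make_file("file A");
    if (fd_a < 0)
        return -1;
    io.fd = fd_a;               /* "submit": only the number is recorded */
    close(fd_a);                /* FD released before the IO was executed */

    fd_b = make_file("file B"); /* lowest free FD number is handed out again */
    if (fd_b < 0)
        return -1;
    execute_late(&io);          /* deferred lookup now sees the other file */
    close(fd_b);

    return fd_b == io.fd && strcmp(io.buf, "file B") == 0;
}
```

Calling demo_fd_reuse() from a small main() should show the queued read picking up the second file's contents, since open() hands out the lowest unused descriptor number.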
There are 2 1/2 ways around this:
1) Stop using the IOSQE_ASYNC heuristic
2a) Wait for all in-flight IOs when any FD gets closed
2b) Wait for all in-flight IOs using an FD when that FD gets closed
Given that we have clear evidence that io_uring doesn't completely support
closing FDs while IOs are in flight, be it a bug or intentional, it seems
clearly better to go for 2a or 2b.
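As a rough sketch of how 2a and 2b differ, the following self-contained C models the bookkeeping only. Everything here is hypothetical: InflightIO, track_submit(), wait_for_io() and the two drain functions are names invented for illustration, not PostgreSQL's AIO structures, and wait_for_io() merely marks the slot as done where a real implementation would block on the completion queue.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_INFLIGHT 8

/* Hypothetical in-flight IO record; not PostgreSQL's actual AIO handle. */
typedef struct InflightIO
{
    int     fd;
    bool    in_use;
} InflightIO;

static InflightIO inflight[MAX_INFLIGHT];

/* Record a submitted IO against its FD; returns the slot, or -1 if full. */
static int
track_submit(int fd)
{
    for (int i = 0; i < MAX_INFLIGHT; i++)
    {
        if (!inflight[i].in_use)
        {
            inflight[i].fd = fd;
            inflight[i].in_use = true;
            return i;
        }
    }
    return -1;
}

/*
 * Placeholder for blocking until the IO in `slot` has completed. Here it
 * only clears the slot; a real version would wait for the completion.
 */
static void
wait_for_io(int slot)
{
    assert(slot >= 0 && slot < MAX_INFLIGHT);
    inflight[slot].in_use = false;
}

/* Number of IOs currently tracked as in flight. */
static int
count_inflight(void)
{
    int     n = 0;

    for (int i = 0; i < MAX_INFLIGHT; i++)
        n += inflight[i].in_use ? 1 : 0;
    return n;
}

/* Option 2b: before closing `fd`, wait only for the IOs that reference it. */
static int
drain_fd_before_close(int fd)
{
    int     waited = 0;

    for (int i = 0; i < MAX_INFLIGHT; i++)
    {
        if (inflight[i].in_use && inflight[i].fd == fd)
        {
            wait_for_io(i);
            waited++;
        }
    }
    return waited;
}

/* Option 2a: before closing any FD, wait for everything that is in flight. */
static int
drain_all_before_close(void)
{
    int     waited = 0;

    for (int i = 0; i < MAX_INFLIGHT; i++)
    {
        if (inflight[i].in_use)
        {
            wait_for_io(i);
            waited++;
        }
    }
    return waited;
}
```

2a needs no per-FD lookup but stalls on unrelated IOs; 2b waits for less but has to be able to find the IOs belonging to one FD.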
Greetings,
Andres Freund
[1] Instead, files are opened when the queue entry is being worked on.
Interestingly that only happens when the IO is *explicitly*
requested to be executed in the workqueue with IOSQE_ASYNC, not when it's
put there because it couldn't be done in a non-blocking way.