From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
Cc: | Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Antonin Houska <ah(at)cybertec(dot)at> |
Subject: | Re: AIO v2.5 |
Date: | 2025-04-07 16:20:15 |
Message-ID: | 4nervqmqplfr23jrjvkp5tsumi6qgouhgjqlubf7ujrudw2epb@6mszddainl4u |
Lists: | pgsql-hackers |
Hi,
On 2025-04-06 23:00:00 +0300, Alexander Lakhin wrote:
> 02.04.2025 14:58, Andres Freund wrote:
> When running multiple installchecks against a single server (please find
> the ready-to-use script attached; I use a more sophisticated version with
> additional patches to make installcheck pass cleanly, but that's not
> required for this case), I've encountered an interesting error related to
> AIO/uring:
> iteration 8: Sun Apr 6 19:22:39 UTC 2025
> installchecks finished: Sun Apr 6 19:23:47 UTC 2025
> 2025-04-06 19:22:44.216 UTC [349525] LOG: could not read blocks 0..0 in file "base/6179194/2606": Operation canceled
> 2025-04-06 19:22:44.216 UTC [349525] ERROR: could not read blocks 0..0 in file "base/6179194/2606": Operation canceled
Thanks for the report, clearly something isn't right.
> It's reproduced better on tmpfs for me; probably you would need to increase
> NUM_INSTALLCHECKS/NUM_ITERATIONS for your machine.
I ran it for a while in a VM, but it hasn't triggered yet, neither on xfs nor
on tmpfs.
> server.log contains:
> 2025-04-06 19:22:44.215 UTC [38231] LOG: checkpoint complete: wrote ...
> 2025-04-06 19:22:44.216 UTC [38231] LOG: checkpoint starting: immediate force wait flush-all
> 2025-04-06 19:22:44.216 UTC [349525] LOG: could not read blocks 0..0 in file "base/6179194/2606": Operation canceled
> 2025-04-06 19:22:44.216 UTC [349525] STATEMENT: alter table parted_copytest
> attach partition parted_copytest_a1 for values in(1);
> 2025-04-06 19:22:44.216 UTC [349525] ERROR: could not read blocks 0..0 in file "base/6179194/2606": Operation canceled
> 2025-04-06 19:22:44.216 UTC [349525] STATEMENT: alter table parted_copytest
> attach partition parted_copytest_a1 for values in(1);
Hm. Does the failure vary between occurrences?
- is it always the same statement? Probably not?
- is it always 2606 (i.e. pg_constraint)?
- does the failure always happen around a checkpoint? If so, is it always an
immediate checkpoint?
- I do assume it's always ECANCELED?
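These questions can be answered mechanically from server.log. What follows is a minimal sketch, not something from the original report: it assumes the log prefix seen in the excerpt above (timestamp, time zone, then the PID in brackets), and the function name summarize_failures and the one-second checkpoint window are hypothetical choices made for illustration.

```python
import re
from datetime import datetime

# Assumed prefix: "2025-04-06 19:22:44.216 UTC [349525] LEVEL: message"
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \S+ '
    r'\[(?P<pid>\d+)\] (?P<level>[A-Z]+): (?P<msg>.*)$'
)
READ_RE = re.compile(
    r'could not read blocks \d+\.\.\d+ in file "(?P<path>[^"]+)": (?P<reason>.*)$'
)
CKPT_RE = re.compile(r'checkpoint starting: (?P<flags>.*)$')


def summarize_failures(lines, window=1.0):
    """Return one dict per ERROR-level read failure found in the log lines."""
    checkpoints = []  # (timestamp, list of checkpoint flags)
    failures = []
    pending = {}      # pid -> failure still waiting for its STATEMENT line
    cont = None       # failure whose STATEMENT may continue on the next line
    for raw in lines:
        line = raw.rstrip("\n")
        m = LINE_RE.match(line)
        if m is None:
            # Unprefixed line: continuation of a multi-line STATEMENT.
            if cont is not None and line.strip():
                cont["statement"] += " " + line.strip()
            continue
        cont = None
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
        pid, level, msg = int(m["pid"]), m["level"], m["msg"]
        if level == "STATEMENT":
            cont = pending.pop(pid, None)
            if cont is not None:
                cont["statement"] = msg.strip()
            continue
        pending.pop(pid, None)
        ck = CKPT_RE.match(msg)
        if ck:
            checkpoints.append((ts, ck["flags"].split()))
            continue
        rd = READ_RE.match(msg)
        if rd and level == "ERROR":  # skip the duplicate LOG-level line
            failure = {
                "ts": ts,
                "pid": pid,
                "path": rd["path"],
                "file": rd["path"].rsplit("/", 1)[-1],
                "reason": rd["reason"],
                "statement": "",
                "near_checkpoint": False,
                "immediate": False,
            }
            failures.append(failure)
            pending[pid] = failure
    # Mark failures that fall within `window` seconds of a checkpoint start.
    for failure in failures:
        for ck_ts, flags in checkpoints:
            if abs((failure["ts"] - ck_ts).total_seconds()) <= window:
                failure["near_checkpoint"] = True
                failure["immediate"] = failure["immediate"] or "immediate" in flags
    return failures
```

Feeding it the lines of server.log (for example via open("server.log")) and tallying the "file", "reason", "statement" and "near_checkpoint" fields with collections.Counter would show whether the failure is always on 2606, always "Operation canceled", and always next to an immediate checkpoint.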
> I can reduce the testing procedure to something trivial, if it makes sense
> for you. Probably, the same effect can also be achieved with just pgbench...
That'd be very helpful!
Greetings,
Andres Freund