From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Noah Misch <noah(at)leadboat(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Direct I/O |
Date: | 2023-04-08 21:15:34 |
Message-ID: | CA+hUKGJ2JqN1O=kfdbZfVZKpTCkZXY4=nMwc1U4xe39YE66GTw@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Apr 9, 2023 at 9:10 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> 2023-04-08 16:50:03.177 EDT [2023-04-08 16:50:03 EDT 3257645:3] 004_io_direct.pl LOG: statement: select count(*) from t1
> 2023-04-08 16:50:03.316 EDT [2023-04-08 16:50:03 EDT 3257646:1] ERROR: invalid page in block 56 of relation base/5/16384
> The fact that the error is happening in a parallel worker seems
> interesting ...
That's because it's running with debug_parallel_query=regress. I've
been trying to reproduce that, but no luck so far. A different kind of
failure also showed up, where it counted the wrong number of tuples:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2023-04-08%2015%3A52%3A03
A paranoid explanation would be that this system is failing to provide
basic I/O coherency: we're writing pages out and not getting the same
data when we read them back in. Or of course there is a dumb bug... but
why only here? It could of course be timing-sensitive, and it's
interesting that crake suffers from the "no unpinned buffers available"
thing (which should now be gone) with higher frequency; I'm keen to see
if the dodgy-read problem continues with a similar frequency now.
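
For illustration only, here is a minimal sketch of the kind of write-then-read-back probe the coherency hypothesis suggests. It is not taken from the PostgreSQL tree or from 004_io_direct.pl. The names check_coherency and open_maybe_direct are hypothetical, the 8192-byte block size is assumed to match PostgreSQL's default page size, and the code assumes a Linux-like system where os.preadv is available. It falls back to buffered I/O when O_DIRECT is not accepted, in which case it says nothing about direct I/O.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 8192  # assumed: PostgreSQL's default page size


def open_maybe_direct(path, flags):
    """Open path with O_DIRECT if possible; return (fd, used_direct)."""
    direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    if direct:
        try:
            return os.open(path, flags | direct, 0o600), True
        except OSError:
            pass  # some filesystems reject O_DIRECT at open time
    return os.open(path, flags, 0o600), False


def check_coherency(path, nblocks=64):
    """Write nblocks distinct pages, read them back, and return
    (list of mismatching block numbers, whether O_DIRECT was used)."""
    # An anonymous mmap is page-aligned, which O_DIRECT transfers need.
    buf = mmap.mmap(-1, BLOCK_SIZE)
    fd, used_direct = open_maybe_direct(path, os.O_RDWR | os.O_CREAT)
    try:
        for blk in range(nblocks):
            # Fill the page with its own block number so a stale or
            # misplaced page is detectable.
            buf[:] = blk.to_bytes(8, "little") * (BLOCK_SIZE // 8)
            os.pwrite(fd, buf, blk * BLOCK_SIZE)
        bad = []
        for blk in range(nblocks):
            expected = blk.to_bytes(8, "little") * (BLOCK_SIZE // 8)
            nread = os.preadv(fd, [buf], blk * BLOCK_SIZE)
            if nread != BLOCK_SIZE or buf[:] != expected:
                bad.append(blk)
        return bad, used_direct
    finally:
        os.close(fd)
        buf.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        bad, used_direct = check_coherency(os.path.join(tmpdir, "probe.dat"))
        print("O_DIRECT used:", used_direct)
        print("mismatching blocks:", bad)
```

A single-process probe like this would only catch a gross coherency failure; it does not exercise the concurrent access or timing that the buildfarm test does, so a clean run would not rule out the problem.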