From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Noah Misch <noah(at)leadboat(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Direct I/O |
Date: | 2023-04-09 02:56:53 |
Message-ID: | CA+hUKG+F8HgEoimV-42bFXJbsd4yeQ6DF1VEc2LZ4bB-OfcV6Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Apr 9, 2023 at 2:18 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-04-09 13:55:33 +1200, Thomas Munro wrote:
> > I think that particular thing might relate to modifications of the
> > user buffer while a write is in progress (breaking btrfs's internal
> > checksums). I don't think we should ever do that ourselves (not least
> > because it'd break our own checksums). We lock the page during the
> > write so no one can do that, and then we sleep in a synchronous
> > syscall.
>
> Oh, but we actually *do* modify pages while IO is going on. I wonder if you
> hit the jack pot here. The content lock doesn't prevent hint bit
> writes. That's why we copy the page to temporary memory when computing
> checksums.
More like the jackpot hit me.
Woo, I can now reproduce this locally on a loop filesystem.
Previously I had missed a step, the parallel worker seems to be
necessary. More soon.
From | Date | Subject | |
---|---|---|---|
Next Message | Gregory Stark (as CFM) | 2023-04-09 03:01:24 | Re: Use fadvise in wal replay |
Previous Message | Andres Freund | 2023-04-09 02:18:09 | Re: Direct I/O |