Re: refactoring relation extension and BufferAlloc(), faster COPY

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: refactoring relation extension and BufferAlloc(), faster COPY
Date:	2023-03-01 17:02:00
Message-ID:	20230301170200.wsag6s425oav7gkg@awork3.anarazel.de
Lists:	pgsql-hackers

Hi,

On 2023-03-01 11:12:35 +0200, Heikki Linnakangas wrote:
> On 27/02/2023 23:45, Andres Freund wrote:
> > But, uh, isn't this code racy? Because this doesn't go through shared buffers,
> > there's no IO_IN_PROGRESS interlocking against a concurrent reader. We know
> > that writing pages isn't atomic vs readers. So another connection could
> > see the new relation size, but a read might return a
> > partially written state of the page. Which then would cause checksum
> > failures. And even worse, I think it could lead to losing a write, if the
> > concurrent connection writes out a page.
>
> fsm_readbuf and vm_readbuf check the relation size first, with
> smgrnblocks(), before trying to read the page. So to have a problem, the
> smgrnblocks() would have to already return the new size, but the smgrread()
> would not return the new contents. I don't think that's possible, but not
> sure.

I hacked Thomas' torn-read test program so that the write side also
ftruncate()s the file.

It frequently observes a file size that's not the write size (e.g. reading 4k
when writing an 8k block).

After extending the test to more than one reader, I indeed also see torn
reads. So far all the tears have been at a 4k block boundary. However, so far
the stale part has always been *prior* page contents, not 0s.
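
For anyone who wants to reproduce this kind of check, here is a minimal
sketch of how a read can be classified as short or torn. It is not Thomas'
program. The names (make_block, classify_read, writer_step, reader_step),
the generation-stamp layout, and truncating to 0 before each write are all
assumptions made for illustration.

```python
import os
import struct

BLOCK_SIZE = 8192              # size of each write, matching the 8k example
STAMP = struct.Struct("<Q")    # hypothetical layout: 8-byte generation counter


def make_block(generation: int, size: int = BLOCK_SIZE) -> bytes:
    """Build a block in which every 8-byte word holds the same generation."""
    return STAMP.pack(generation) * (size // STAMP.size)


def classify_read(buf: bytes, size: int = BLOCK_SIZE):
    """Classify the result of one read.

    Returns ("short", n_bytes) if fewer than `size` bytes came back,
    ("torn", offset) with the byte offset of the first generation change,
    or ("ok", generation) if the whole block stems from a single write.
    """
    if len(buf) < size:
        return ("short", len(buf))
    first = STAMP.unpack_from(buf, 0)[0]
    for off in range(STAMP.size, size, STAMP.size):
        if STAMP.unpack_from(buf, off)[0] != first:
            return ("torn", off)
    return ("ok", first)


def writer_step(fd: int, generation: int) -> None:
    """One writer iteration (assumed shape): shrink the file, then rewrite it."""
    os.ftruncate(fd, 0)
    os.pwrite(fd, make_block(generation), 0)


def reader_step(fd: int):
    """One reader iteration: read the block at offset 0 and classify it."""
    return classify_read(os.pread(fd, BLOCK_SIZE, 0))
```

Running writer_step() in one process and reader_step() in several others
against the same file is the racy part. Whether short or torn results show up
depends on the kernel and filesystem, so the sketch makes no claim about what
any particular system will report.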

Greetings,

Andres Freund

In response to

Re: refactoring relation extension and BufferAlloc(), faster COPY at 2023-03-01 09:12:35 from Heikki Linnakangas

Responses

Re: refactoring relation extension and BufferAlloc(), faster COPY at 2023-03-01 17:25:03 from Andres Freund
