Quick Links

Re: Relation extension scalability

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Relation extension scalability
Date:	2015-03-30 01:05:04
Message-ID:	10634.1427677504@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I thought the primary reason we did this is because we wanted to
> write-and-fsync the block so that, if we're out of disk space, any
> attendant failure will happen before we put data into the block. Once
> we've initialized the block, a subsequent failure to write or fsync it
> will be hard to recover from; basically, we won't be able to
> checkpoint any more. If we discover the problem while the block is
> still all-zeroes, the transaction that uncovers the problem errors
> out, but the system as a whole is still OK.

Yeah. As Andres says, the fsync is not an important part of that,
but we do expect that ENOSPC will happen during the initial write()
if it's going to happen.

To some extent that's an obsolete assumption, I'm afraid --- I believe
that some modern filesystems don't necessarily overwrite the previous
version of a block, which would mean that they are capable of failing
with ENOSPC even during a re-write of a previously-written block.
However, the possibility of filesystem misfeasance of that sort doesn't
excuse us from having a clear recovery strategy for failures during
relation extension.

regards, tom lane

In response to

Re: Relation extension scalability at 2015-03-30 00:02:06 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Kapila	2015-03-30 04:03:57	Re: Relation extension scalability
Previous Message	Andres Freund	2015-03-30 00:47:09	Re: Relation extension scalability