Re: Why are we PageInit'ing buffers in RelationAddExtraBlocks()?

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Why are we PageInit'ing buffers in RelationAddExtraBlocks()?
Date: 2018-12-19 12:21:31
Message-ID: CAA4eK1+NFFGj3dKeemazFP0dMGM3LwcjC+f8ikeBhjfm+eUz9g@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 19, 2018 at 2:09 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> The zheap patchset, even after being based on pluggable storage,
> currently has the following condition in RelationAddExtraBlocks():
>     if (RelationStorageIsZHeap(relation))
>     {
>         Assert(BufferGetBlockNumber(buffer) != ZHEAP_METAPAGE);
>         ZheapInitPage(page, BufferGetPageSize(buffer));
>         freespace = PageGetZHeapFreeSpace(page);
>     }
>     else
>     {
>         PageInit(page, BufferGetPageSize(buffer), 0);
>         freespace = PageGetHeapFreeSpace(page);
>     }
>
> I.e. it initializes the page differently when zheap is used versus
> heap.
>
> Thinking about whether it's worth making that function extensible
> made me wonder: Is it actually a good idea to initialize the page at
> that point, including marking it dirty?
>
> As far as I can tell, that has several downsides:
> - Dirtying the buffer for initialization will cause potentially
>   superfluous IO, with no interesting data in the write except for a
>   newly initialized page.
> - As there's no sort of interlock, it's entirely possible that, after a
>   crash, the blocks will come up empty, but with the FSM returning them
>   as empty, so that path would be good to support anyway.
> - It adds heap-specific code to a routine that otherwise could be
>   generic for different table access methods.
>

IIUC, your proposal is to remove the page initialization and
MarkBufferDirty from RelationAddExtraBlocks(), but still record the new
pages in the FSM. Is my understanding correct? If so, I don't see any
problem with that, and as you have mentioned, it will be generally
advantageous as well.

> It seems to me, this could be optimized by *not* initializing the page,
> and having a PageIsNew() check at the places that check whether the
> page is new, and initializing it in that case.
>

Makes sense to me.
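
For illustration only, here is a minimal standalone sketch of that
lazy-initialization idea. It is a toy model and not PostgreSQL code:
ToyPage, ToyPageHeader, toy_extend() and toy_prepare_page_for_insert()
are hypothetical names made up for this example. The only thing it
borrows from the real code is the convention PageIsNew() relies on,
namely that an all-zero page (upper offset of 0) has never been
initialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 8192

/* Hypothetical, simplified header; NOT PostgreSQL's PageHeaderData. */
typedef struct ToyPageHeader
{
    uint16_t lower;     /* offset to start of free space */
    uint16_t upper;     /* offset to end of free space; 0 = never initialized */
} ToyPageHeader;

/* A page is a fixed-size block whose first bytes are the header. */
typedef union ToyPage
{
    ToyPageHeader hdr;
    unsigned char bytes[TOY_PAGE_SIZE];
} ToyPage;

/* Same idea as PageIsNew(): a zero upper offset means "not initialized". */
static bool
toy_page_is_new(const ToyPage *page)
{
    return page->hdr.upper == 0;
}

/* Stand-in for an access-method-specific page initialization routine. */
static void
toy_page_init(ToyPage *page)
{
    memset(page->bytes, 0, TOY_PAGE_SIZE);
    page->hdr.lower = (uint16_t) sizeof(ToyPageHeader);
    page->hdr.upper = (uint16_t) TOY_PAGE_SIZE;
}

static size_t
toy_page_free_space(const ToyPage *page)
{
    return (size_t) (page->hdr.upper - page->hdr.lower);
}

/*
 * Extension path under the proposal: hand out a zeroed block and do
 * nothing access-method specific (no init, no dirtying).
 */
static void
toy_extend(ToyPage *page)
{
    memset(page->bytes, 0, TOY_PAGE_SIZE);
}

/*
 * Consumer path: whoever picks the page up for an insert initializes it
 * on first use. Returns true if initialization was needed.
 */
static bool
toy_prepare_page_for_insert(ToyPage *page)
{
    if (toy_page_is_new(page))
    {
        toy_page_init(page);
        return true;
    }
    assert(toy_page_free_space(page) <= TOY_PAGE_SIZE);
    return false;
}
```

In this model a block that comes back all-zero after a crash looks
exactly like a freshly extended one, so the consumer path handles both
cases with the same check.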

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
