Re: [Patch] Optimize dropping of relation buffers using dlist

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: tsunakawa(dot)takay(at)fujitsu(dot)com
Cc: k(dot)jamison(at)fujitsu(dot)com, amit(dot)kapila16(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, tomas(dot)vondra(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-10-09 02:12:01
Message-ID: 20201009.111201.608064254389053226.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Fri, 9 Oct 2020 00:41:24 +0000, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com> wrote in
> From: Jamison, Kirk/ジャミソン カーク <k(dot)jamison(at)fujitsu(dot)com>
> > > (6)
> > > + bufHdr->tag.blockNum >=
> > > firstDelBlock[j])
> > > + InvalidateBuffer(bufHdr); /*
> > > releases spinlock */
> > >
> > > The right side of >= should be cur_block.
> >
> > Fixed.
>
> >= should be =, shouldn't it?
>
> Please measure and let us see just the recovery performance again because the critical part of the patch was modified. If the performance is good as the previous one, and there's no review interaction with others in progress, I'll mark the patch as ready for committer in a few days.

The performance is expected to be kept since smgrnblocks() is called
in a non-hot code path and actually it is called at most four times
per a buffer drop in this patch. But it's better making it sure.

I have some comments on the latest patch.

@@ -445,6 +445,7 @@ BlockNumber
visibilitymap_prepare_truncate(Relation rel, BlockNumber nheapblocks)
{
BlockNumber newnblocks;
+ bool cached;

All the added variables added by 0002 is useless because all the
caller sites are not interested in the value. smgrnblocks should
accept NULL as isCached. (I'm agree with Tsunakawa-san that the
camel-case name is not common there.)

+ nForkBlocks[i] = smgrnblocks(smgr_reln, forkNum[i], &isCached);
+
+ if (!isCached)

"is cached" is not the property that code is interested in. No other callers to smgrnblocks are interested in that property. The need for caching is purely internal of smgrnblocks().

On the other hand, we are going to utilize the property of "accuracy"
that is a biproduct of reducing fseek calls, and, again, not
interested in how it is achieved.

So I suggest that the name should be "accurite" or something that is
not suggest the mechanism used under the hood.

+ if (nTotalBlocks != InvalidBlockNumber &&
+ nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD)

I don't think nTotalBlocks is useful. What we need here is only total
blocks for every forks (nForkBlocks[]) and the total number of buffers
to be invalidated for all forks (nBlocksToInvalidate).

> > > The right side of >= should be cur_block.
> >
> > Fixed.
>
> >= should be =, shouldn't it?

It's just from a paranoia. What we are going to invalidate is blocks
blockNum of which >= curBlock. Although actually there's no chance of
any other processes having replaced the buffer with another page (with
lower blockid) of the same relation after BugTableLookup(), that
condition makes it sure not to leave blocks to be invalidated left
alone.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-10-09 02:24:46 Re: [Patch] Optimize dropping of relation buffers using dlist
Previous Message Tom Lane 2020-10-09 02:06:25 Re: Assertion failure with LEFT JOINs among >500 relations