From: | "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com> |
---|---|
To: | "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: [Patch] Optimize dropping of relation buffers using dlist |
Date: | 2020-09-28 07:29:57 |
Message-ID: | OSBPR01MB23417A2B3663D884EFA8EC85EF350@OSBPR01MB2341.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Monday, September 28, 2020 11:50 AM, Tsunakawa-san wrote:
> From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > I agree with the above two points.
>
> Thank you. I'm relieved to know I didn't misunderstand.
>
>
> > > * Then, add a new function, say, smgrnblocks_cached() that simply
> > > returns
> > the cached block count, and DropRelFileNodeBuffers() uses it instead
> > of smgrnblocks().
> > >
> >
> > I am not sure if it worth adding a new function for this. Why not
> > simply add a boolean variable in smgrnblocks for this?
>
>
> One reason is that adding an argument requires modification of existing call
> sites (10 + a few). Another is that, although this may be different for each
> person's taste, it's sometimes not easy to understand when a function call
> with true/false appears. One such example is find_XXX(some_args,
> true/false), where the true/false represents missing_ok. Another example is
> as follows. I often wonder "what's the meaning of this false, and that true?"
>
> if (!InstallXLogFileSegment(&destsegno, tmppath, false, 0, false))
> elog(ERROR, "InstallXLogFileSegment should not have failed");
>
> Fortunately, the new function is very short and doesn't duplicate much code.
> The function is a simple getter and the function name can convey the
> meaning straight (if the name is good.)
>
>
> > BTW, AFAICS, the latest patch
> > doesn't have code to address this point.
>
> Kirk-san, can you address this? I don't mind much if you add an argument
> or a new function.
I maybe missing something. so I'd like to check if my understanding is correct,
as I'm confused with what do we mean exactly by "cached value of nblocks".
Discussed upthread, smgrnblocks() does not always guarantee that it returns a
"cached" nblocks even in recovery.
When we enter this path in recovery path of DropRelFileNodeBuffers,
according to Tsunakawa-san:
>> * During recovery, DropRelFileNodeBuffers() gets the cached size of the relation fork. If it is cached, trust it and optimize the buffer invalidation. If it's not cached, we can't trust the return value of smgrnblocks() because it's the lseek(END) return value, so we avoid the optimization.
+ nTotalBlocks = smgrnblocks(smgr_reln, forkNum[j]);
But this comment in the smgrnblocks source code:
* For now, we only use cached values in recovery due to lack of a shared
* invalidation mechanism for changes in file size.
*/
if (InRecovery && reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
return reln->smgr_cached_nblocks[forknum];
So the nblocks returned in DropRelFileNodeBuffers are still not guaranteed to be "cached values"?
And that we want to add a new function (I think it's the lesser complicated way than modifying smgrnblocks):
/*
* smgrnblocksvalid() -- Calculate the number of blocks that are cached in
* the supplied relation.
*
* It is equivalent to calling smgrnblocks, but only used in recovery for now
* when DropRelFileNodeBuffers() is called, to ensure that only cached value
* is used, which is always valid.
*
* This returns an InvalidBlockNumber when smgr_cached_nblocks is not available
* and when isCached is false.
*/
BlockNumber
smgrnblocksvalid(SMgrRelation reln, ForkNumber forknum, bool isCached)
{
BlockNumber result;
/*
* For now, we only use cached values in recovery due to lack of a shared
* invalidation mechanism for changes in file size.
*/
if (InRecovery && if reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber
&& isCached)
return reln->smgr_cached_nblocks[forknum];
}
result = smgrsw[reln->smgr_which].smgr_nblocks(reln, forknum);
reln->smgr_cached_nblocks[forknum] = result;
if (!InRecovery && !isCached)
return InvalidBlockNumber;
return result;
}
Then in DropRelFileNodeBuffers
+ nTotalBlocks = smgrcachednblocks(smgr_reln, forkNum[j], true);
Is my understanding above correct?
Regards,
Kirk Jamison
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2020-09-28 07:33:30 | __pg_log_level in anonynous enum should be initialized? (Was: pgsql: Change SHA2 implementation based on OpenSSL to use EVP digest ro) |
Previous Message | Andy Fan | 2020-09-28 07:22:49 | Re: Partition prune with stable Expr |