RE: [Patch] Optimize dropping of relation buffers using dlist

From: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
To: 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-12-08 06:17:52
Message-ID: OSBPR01MB2341FB5BA04DCD2577E1E47BEFCD0@OSBPR01MB2341.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Tuesday, December 8, 2020 2:35 PM, Amit Kapila wrote:

> On Tue, Dec 8, 2020 at 10:41 AM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> >
> > At Tue, 8 Dec 2020 08:08:25 +0530, Amit Kapila
> > <amit(dot)kapila16(at)gmail(dot)com> wrote in
> > > On Tue, Dec 8, 2020 at 7:24 AM Kyotaro Horiguchi
> > > <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > > > We drop
> > > > buffers for the old relfilenode on truncation anyway.
> > > >
> > > > What I did is:
> > > >
> > > > a: Create a physical replication pair.
> > > > b: On the master, create a table. (without explicitly starting a
> > > > tx)
> > > > c: On the master, insert a tuple into the table.
> > > > d: On the master truncate the table.
> > > >
> > > > On the standby, smgrnblocks is called for the old relfilenode of
> > > > the table at c, then the same function is called for the same
> > > > relfilenode at d and the function takes the cached path.
> > > >
> > >
> > > This is on the lines I have tried for recovery. So, it seems we are
> > > in agreement that we can use the 'cached' flag in
> > > DropRelFileNodesAllBuffers and it will take the optimized path in
> > > many such cases, right?
> >
> >
> > Mmm. There seems to be a misunderstanding.. What I opposed to is
> > referring only to InRecovery and ignoring the value of "cached".
> >
>
> Okay, I think it was Kirk-San who proposed to use InRecovery and ignoring
> the value of "cached" based on the theory that even if Insert (or other DMLs)
> are done before Truncate, it won't use an optimized path and I don't agree
> with the same. So, I did a small test to check the same and found that it
> should use the optimized path and the same is true for the experiment done
> by you. I am not sure why Kirk-San is seeing something different?
>
> > The remaining issue is we don't get to the optimized path when a
> > standby makes the first call to smgrnblocks() when truncating a
> > relation. Still we can get to the optimized path as far as any
> > update(+insert) or select is performed earlier on the relation so I
> > think it doesn't matter so much.
> >
>
> +1.

My earlier question/proposal was to either rely on InRecovery alone,
or drop the "cached" flag of smgrnblocks() completely.
That came from the results of my investigation below, in which
I used "cached" in DropRelFileNodesAllBuffers().
The optimized path was skipped because the "cached" value of one of
the rels was false.

Test case (shared_buffers = 1GB):
0. Set up physical replication between master and standby.
1. Create 1 table.
2. Insert data (1MB) into the table.
16385 is the relNode for the insert (on both master and standby).

3. Pause WAL replay on Standby.
4. TRUNCATE table on Primary.
nrels = 3. relNodes 16389, 16388, 16385.

5. Stop Primary.

6. Promote standby and resume WAL recovery. nrels = 3
1st rel's check for optimization: "cached" is TRUE. relNode = 16389.
2nd rel's check for optimization: "cached" was returned as FALSE by
smgrnblocks(). relNode = 16388.
Since "cached" is FALSE for one of the rels, the optimization check for
the 3rd relation, and the optimization as a whole, is skipped.
Go to full-scan path in DropRelFileNodesAllBuffers().
Then smgrclose for relNodes 16389, 16388, 16385.
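
To make the behavior in step 6 concrete, here is a minimal standalone
sketch of the all-or-nothing check as I understand it. It is not the
patch code: RelInfo and can_use_optimized_path() are hypothetical names
used only for illustration, and the "cached" field simply stands in for
the flag that smgrnblocks() reported for each rel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one relation handed to the drop routine. */
typedef struct RelInfo
{
    unsigned int relNode;   /* relfilenode number, e.g. 16388 */
    bool         cached;    /* assumed: flag reported by smgrnblocks() */
} RelInfo;

/*
 * Return true only if every rel reported a cached size. The loop stops
 * at the first rel that did not, so later rels are never checked. If
 * first_uncached is not NULL, it receives the index of that rel.
 */
bool
can_use_optimized_path(const RelInfo *rels, size_t nrels,
                       size_t *first_uncached)
{
    assert(rels != NULL || nrels == 0);

    for (size_t i = 0; i < nrels; i++)
    {
        if (!rels[i].cached)
        {
            if (first_uncached != NULL)
                *first_uncached = i;
            return false;   /* caller falls back to the full scan */
        }
    }
    return true;
}
```

With the three rels from step 6 passed in the order 16389, 16388, 16385,
the function returns false at index 1 (relNode 16388) without looking at
16385, which corresponds to the skipped third check above.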

Because the "cached" value of one of the rels was false, the
full-scan path was forced for TRUNCATE.
Is there a possible workaround for this?

Regards,
Kirk Jamison
