RE: [Patch] Optimize dropping of relation buffers using dlist

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-09-25 08:55:03
Message-ID: TYAPR01MB2990F96C978ACACED2DD57AAFE360@TYAPR01MB2990.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> No, we need to be careful during recovery as well. We need to ensure that
> we use the cached value during recovery and that the cached value is always
> up-to-date. We can't rely on lseek; I have described a scenario
> upthread [1] where such behavior can cause a problem. See also the
> response from Tom Lane on why the same can be true for recovery as well.
>
> The basic approach we are trying to pursue here is to rely on the
> cached value of 'number of blocks' (it always gives the correct value,
> and if there is a problem, it will be our bug; we don't need to
> rely on the OS for the correct value, and it will be better w.r.t. performance
> as well). It is currently only possible during recovery, so we are
> using it in the recovery path. Later, once Thomas's patch to cache it
> for non-recovery cases is also done, we can use it for non-recovery
> cases as well.

I may still be confused, but my understanding is that Kirk-san's patch should:

* Still focus on speeding up the replay of TRUNCATE during recovery.

* During recovery, DropRelFileNodeBuffers() gets the cached size of the relation fork. If it is cached, trust it and optimize the buffer invalidation. If it's not cached, we can't trust the return value of smgrnblocks() because it's the lseek(END) return value, so we avoid the optimization.

* Then, add a new function, say, smgrnblocks_cached(), that simply returns the cached block count, and have DropRelFileNodeBuffers() use it instead of smgrnblocks().
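
To make the three points above concrete, here is a minimal sketch of the decision only. It is not the actual patch: SketchRelation, DropStrategy and choose_drop_strategy() are hypothetical names for illustration, the types are simplified stand-ins for the real ones, and I am assuming that smgrnblocks_cached() would report "not cached" by returning InvalidBlockNumber.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL types; not the real definitions. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical handle holding the cached size of one relation fork. */
typedef struct SketchRelation
{
    BlockNumber cached_nblocks; /* InvalidBlockNumber means "not cached" */
} SketchRelation;

/* Hypothetical result type: which invalidation path to take. */
typedef enum DropStrategy
{
    DROP_FULL_SCAN,         /* scan the whole buffer pool (no optimization) */
    DROP_TARGETED_LOOKUP    /* look up only the blocks being truncated */
} DropStrategy;

/*
 * Proposed helper: return only the cached block count.  Unlike
 * smgrnblocks(), it never falls back to lseek(), so the caller can
 * tell "trusted cached value" apart from "no value available".
 */
static BlockNumber
smgrnblocks_cached(const SketchRelation *rel)
{
    return rel->cached_nblocks;
}

/*
 * Decision that DropRelFileNodeBuffers() would make: optimize only
 * during recovery and only when the fork size is cached.
 */
static DropStrategy
choose_drop_strategy(const SketchRelation *rel, bool in_recovery)
{
    assert(rel != NULL);

    if (!in_recovery)
        return DROP_FULL_SCAN;      /* the cache is only reliable in recovery */

    if (smgrnblocks_cached(rel) == InvalidBlockNumber)
        return DROP_FULL_SCAN;      /* not cached: don't trust lseek(END) */

    return DROP_TARGETED_LOOKUP;    /* cached: trust it and optimize */
}
```

The point of a separate function is that the caller never sees a size obtained from lseek(END), so the optimized path is taken only when the value is one we maintain ourselves.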

Regards
Takayuki Tsunakawa
