Re: [HACKERS] GUC for cleanup indexes threshold.

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: [HACKERS] GUC for cleanup indexes threshold.
Date: 2018-03-05 22:31:23
Message-ID: CAPpHfdsyX_kOAkcKOcgYtWh0zSjP7i_U85r225mggP4-qpP7OA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 5, 2018 at 5:56 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:

> On Sun, Mar 4, 2018 at 8:59 AM, Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > On Fri, Mar 2, 2018 at 10:53 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> > wrote:
> >>
> >> > 2) In the append-only case, index statistics can lag indefinitely.
> >>
> >> The original proposal proposed a new GUC that specifies a fraction of
> >> the modified pages to trigger a cleanup indexes.
> >
> >
> > Regarding the original proposal, I didn't get what exactly it's intended
> > to be.
> > You're checking if vacuumed_pages >= nblocks * vacuum_cleanup_index_scale.
> > But vacuumed_pages is the variable which could be incremented when
> > no indexes exist on the table. When indexes are present, this variable
> > is always zero. I can assume that it's intended to compare the number of
> > pages where at least one tuple is deleted to
> > nblocks * vacuum_cleanup_index_scale.
> > But that is also not an option for us, because we're going to optimize
> > the case when exactly zero tuples are deleted by vacuum.
>
> In the latest v4 patch, I compare scanned_pages and the threshold,
> which means if the number of pages that are modified since the last
> vacuum is larger than the threshold we force cleanup index.
>

Right, sorry, I overlooked that. However, even if we use the number of
pages, I would still prefer a cumulative measure, so that the number of
vacuums is taken into account even if each of them touched only a small
number of pages.
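
To illustrate what I mean by a cumulative measure, here is a minimal
standalone sketch. It is not code from any patch version: CleanupState,
pages_since_cleanup and cleanup_needed_cumulative() are hypothetical names,
and where such a counter would actually be stored is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-index state; none of these names come from the patch. */
typedef struct CleanupState
{
    /* heap pages scanned by all vacuums since the last index cleanup */
    uint64_t    pages_since_cleanup;
} CleanupState;

/*
 * Account for one vacuum that scanned "scanned_pages" pages of a table with
 * "nblocks" pages, and report whether index cleanup should be forced.
 * The counter accumulates across vacuums and is reset once cleanup is forced.
 */
static bool
cleanup_needed_cumulative(CleanupState *state, uint64_t scanned_pages,
                          uint64_t nblocks, double scale)
{
    assert(scale >= 0.0);

    state->pages_since_cleanup += scanned_pages;

    if ((double) state->pages_since_cleanup >= (double) nblocks * scale)
    {
        state->pages_since_cleanup = 0;
        return true;
    }
    return false;
}
```

With a per-vacuum check, many small vacuums would never reach the threshold;
with the accumulated counter they eventually do.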

> > The thing I'm going to propose is to add the estimated number of tuples
> > in the table to IndexVacuumInfo. Then B-tree can memorize in the
> > meta-page the number of tuples seen when the index was last scanned. If
> > the passed value differs too much from the value in the meta-page, then
> > cleanup is forced.
> >
> > Any better ideas?
>
> I think that would work. But I'm concerned about metapage format
> compatibility.

That's not a show-stopper. The B-tree meta page has a version number, so
it's no problem to provide an online update.
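
A rough standalone sketch of how a versioned meta page permits this. The
struct layout, the version constants and the function names are all
assumptions made for illustration; they are not the actual nbtree
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical version numbers, chosen only for this sketch. */
#define META_VERSION_OLD 2      /* layout without the new field */
#define META_VERSION_NEW 3      /* layout with last_cleanup_num_tuples */

typedef struct MetaPage
{
    uint32_t    version;
    /* only meaningful when version >= META_VERSION_NEW */
    double      last_cleanup_num_tuples;
} MetaPage;

/*
 * Read the stored tuple count. An old-format meta page has no such value,
 * so return -1 to mean "unknown".
 */
static double
meta_get_last_cleanup_tuples(const MetaPage *meta)
{
    assert(meta->version >= META_VERSION_OLD &&
           meta->version <= META_VERSION_NEW);

    if (meta->version < META_VERSION_NEW)
        return -1.0;
    return meta->last_cleanup_num_tuples;
}

/*
 * Store the tuple count, upgrading an old-format meta page in place the
 * first time it is written.
 */
static void
meta_upgrade_and_store(MetaPage *meta, double num_tuples)
{
    if (meta->version < META_VERSION_NEW)
        meta->version = META_VERSION_NEW;
    meta->last_cleanup_num_tuples = num_tuples;
}
```

Readers treat an old-version page as having no recorded value, and the first
writer bumps the version, so no dump and restore would be needed.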

> And since I've not fully investigated about cleanup
> index of other index types I'm not sure that interface makes sense. It
> might not be better but an alternative idea is to add a condition
> (Irel[i]->rd_rel->relam == BTREE_AM_OID) in lazy_scan_heap.

I meant putting this logic *inside* btvacuumcleanup(), while passing the
required measure via IndexVacuumInfo, which is accessible from
btvacuumcleanup().
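
Roughly, the decision made there could look like the standalone sketch
below. VacuumInfoSketch stands in for IndexVacuumInfo, and its field, the
scale parameter and the function name are hypothetical; the real check
would read the remembered value from the meta-page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the measure passed down by vacuum. */
typedef struct VacuumInfoSketch
{
    double      num_heap_tuples;    /* estimated tuples in the table now */
} VacuumInfoSketch;

/*
 * Decide whether the index cleanup scan is needed.
 * "last_cleanup_num_tuples" is the estimate remembered from the last scan,
 * or a negative value if nothing was remembered. Cleanup is forced when the
 * current estimate differs from it by more than the fraction "scale".
 */
static bool
cleanup_needed_by_tuples(const VacuumInfoSketch *info,
                         double last_cleanup_num_tuples, double scale)
{
    double      diff;

    assert(scale >= 0.0);

    /* Nothing remembered yet: be conservative and do the cleanup. */
    if (last_cleanup_num_tuples < 0.0)
        return true;

    diff = info->num_heap_tuples - last_cleanup_num_tuples;
    if (diff < 0.0)
        diff = -diff;

    return diff > last_cleanup_num_tuples * scale;
}
```

This way the heap-side code needs no access-method check, and other index
types can simply ignore the extra field.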

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
