Re: GUC for cleanup indexes threshold.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: y(dot)sokolov(at)postgrespro(dot)ru
Cc: sawada(dot)mshk(at)gmail(dot)com, pg(at)bowt(dot)ie, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, david(at)pgmasters(dot)net, amit(dot)kapila16(at)gmail(dot)com, simon(at)2ndquadrant(dot)com, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, pgsql-hackers(at)postgresql(dot)org, kuntalghosh(dot)2007(at)gmail(dot)com, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: GUC for cleanup indexes threshold.
Date: 2017-09-25 12:34:21
Message-ID: 20170925.213421.78149069.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

At Fri, 22 Sep 2017 17:15:08 +0300, Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru> wrote in <c7d736b5ea4cda67a644a0247f1a3951(at)postgrespro(dot)ru>
> On 2017-09-22 16:22, Sokolov Yura wrote:
> > On 2017-09-22 11:21, Masahiko Sawada wrote:
> >> On Fri, Sep 22, 2017 at 4:16 PM, Kyotaro HORIGUCHI
> >> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >>> At Fri, 22 Sep 2017 15:00:20 +0900, Masahiko Sawada
> >>> <sawada(dot)mshk(at)gmail(dot)com> wrote in
> >>> <CAD21AoD6zgb1W6ps1aXj0CcAB_chDYiiTNtEdpMhkefGg13-GQ(at)mail(dot)gmail(dot)com>
> >>>> On Tue, Sep 19, 2017 at 3:31 PM, Kyotaro HORIGUCHI
> >>>> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >>>> > I was just looking at this thread since I found it left alone for a
> >>>> > long time in the CF app.
> >>>> >
> >>>> > At Mon, 18 Sep 2017 16:35:58 -0700, Peter Geoghegan <pg(at)bowt(dot)ie> wrote
> >>>> > in
> >>>> > <CAH2-WzkhJhAXD+6DdBp7D8WYLfJ3D0m=AZbGsiw=USUjTmuv-g(at)mail(dot)gmail(dot)com>
> >>>> >> On Wed, Apr 5, 2017 at 3:50 PM, Andres Freund <andres(at)anarazel(dot)de>
> >>>> >> wrote:
> >>>> >> > Hi,
> >>>> >> >
> >>>> >> > On 2017-04-01 03:05:07 +0900, Masahiko Sawada wrote:
> >>>> >> >> On Fri, Mar 31, 2017 at 11:44 PM, Robert Haas
> >>>> >> >> <robertmhaas(at)gmail(dot)com> wrote:
> >>>> >> >> [ lots of valuable discussion ]
> >>>> >> >
> >>>> >> > I think this patch clearly still is in the design stage, and has
> >>>> >> > received plenty of feedback this CF. I'll therefore move this to
> >>>> >> > the next commitfest.
> >>>> >>
> >>>> >> Does anyone have ideas on a way forward here? I don't, but then I
> >>>> >> haven't thought about it in detail in several months.
> >>>> >
> >>>> > Is the additional storage in the metapage to store the current
> >>>> > status of vacuum still unacceptable, even if it can avoid a useless
> >>>> > full scan of index pages, especially for stable tables?
> >>>> >
> >>>> > Or, how about an additional bit in pg_stat_*_index to indicate
> >>>> > that the index *doesn't* require the vacuum cleanup stage? (The
> >>>> > default value causes cleanup.)
> >>>> Did you mean that "the next cycle" is the lazy_cleanup_index() function
> >>>> called by lazy_scan_heap()?
> >>> Both finally call btvacuumscan under a certain condition, but
> >>> what I meant by "the next cycle" is the lazy_cleanup_index call
> >>> in the next round of vacuum, since the abstraction layer (index
> >>> AM) isn't aware of the details of btree.
> >>>
> >>>> > index_bulk_delete (or ambulkdelete) returns the flag in
> >>>> > IndexBulkDeleteResult, then lazy_scan_heap stores the flag in
> >>>> > stats, and in the next cycle it is looked up to decide whether
> >>>> > index cleanup is necessary.
> >>>> >
> >>>> Could you elaborate on this? For example, in a btree index, the index
> >>>> cleanup skips scanning the index if index_bulk_delete has been
> >>>> called during vacuuming, because stats != NULL. So I think we don't
> >>>> need such a flag.
> >>> The flag works so that two successive index full scans don't
> >>> happen in one vacuum round. If any rows are fully deleted, the
> >>> btvacuumcleanup that immediately follows does nothing.
> >>> I think what you wanted to solve here was the problem that
> >>> index_vacuum_cleanup runs a full scan even if it ends up doing no
> >>> actual work, for manual or anti-wraparound vacuums. (I'm
> >>> getting a bit confused on this..) It is caused by using the
> >>> pointer "stats" as the flag that instructs it to do that. If the
> >>> stats-as-a-flag mechanism worked as expected, the GUC wouldn't
> >>> seem to be required.
> >> Hmm, my proposal is that if a table hasn't changed much since the
> >> previous vacuum, we skip cleaning up the index.
> >> If the table has at least some garbage we do the lazy_vacuum_index and
> >> then an IndexBulkDeleteResult is stored, which causes the
> >> btvacuumcleanup to be skipped. On the other hand, if the table doesn't
> >> have any garbage but some new tuples were inserted since the previous
> >> vacuum, we don't do the lazy_vacuum_index but do the
> >> lazy_cleanup_index. In this case, we always do the lazy_cleanup_index
> >> (i.e., we do the full scan) even if only one tuple was inserted.
> >> That's why I proposed a new GUC parameter which allows us to skip the
> >> lazy_cleanup_index in that case.
> >>
> >>> In addition to that, as Simon and Peter pointed out,
> >>> index_bulk_delete can leave not-fully-removed pages (so-called
> >>> half-dead pages and pages that are recyclable but not registered
> >>> in the FSM, AFAICS) in some cases, mainly because of the
> >>> RecentGlobalXmin interlock. In this case, just inhibiting the
> >>> cleanup scan by a threshold lets such dangling pages persist in
> >>> the index. (I couldn't produce that many dangling pages, though..)
> >>> The first patch in the mail (*1) does that. It seems to have some
> >>> bugs, though..
> >>> Since the dangling pages persist until autovacuum decides to scan
> >>> the owning table again, we should run a vacuum round (or
> >>> index_vacuum_cleanup itself) even when there are no dead rows if
> >>> we want to clean up such pages within a certain period. The second
> >>> patch does that.
> >>>
> >> IIUC half-dead pages are not relevant to this proposal. The proposal
> >> has two problems:
> >> * By skipping index cleanup we could leave recyclable pages that are
> >> not marked as recyclable.
> >> * We stash an XID when a btree page is deleted, which is used to
> >> determine when it's finally safe to recycle the page.
> >> Regards,
> >> --
> >> Masahiko Sawada
> >> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> >> NTT Open Source Software Center
> > Here is a small patch that skips scanning a btree index if no pending
> > deleted pages exist.
> > It detects this situation by comparing pages_deleted with pages_free.

It seems to work to prevent needless cleanup scans. (Although I'm
still uncertain how to deal with half-dead pages..)
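
For what it's worth, my reading of that test is roughly the following.
This is only a sketch of the idea using the existing
IndexBulkDeleteResult fields, not the patch itself:

#include "postgres.h"
#include "access/genam.h"

/*
 * Sketch only: after a vacuum scan, decide whether a further cleanup
 * scan is still needed.  pages_deleted counts pages found deleted,
 * pages_free counts pages already reusable; if the two match, no
 * deleted page is still waiting to become recyclable.
 */
static bool
cleanup_scan_still_needed(const IndexBulkDeleteResult *stats)
{
	return stats->pages_deleted != stats->pages_free;
}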

> > If they are equal, then there are no half-deleted pages, and it is
> > safe to skip the next lazy scan.
> > The flag is stored in btpo_flags. It is unset (WAL-logged) before a
> > scan.

I'm not sure it's good to add metapage-specific information
into per-page storage.
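
If we keep such a flag at all, holding it in the metapage contents
proper seems more natural to me than borrowing a bit of btpo_flags.
A sketch of what I mean (the struct is copied here for illustration
and the last field is purely hypothetical):

#include "postgres.h"
#include "storage/block.h"

/*
 * Sketch only: the btree metapage layout (see access/nbtree.h) plus one
 * hypothetical field.  btm_cleanup_pending is an invention here, used
 * only to show where such state could live.
 */
typedef struct BTMetaPageDataSketch
{
	uint32		btm_magic;		/* should contain BTREE_MAGIC */
	uint32		btm_version;	/* should contain BTREE_VERSION */
	BlockNumber btm_root;		/* current root location */
	uint32		btm_level;		/* tree level of the root page */
	BlockNumber btm_fastroot;	/* current "fast" root location */
	uint32		btm_fastlevel;	/* tree level of the "fast" root page */
	bool		btm_cleanup_pending;	/* HYPOTHETICAL: deleted pages that
										 * are not yet recyclable remain? */
} BTMetaPageDataSketch;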

> > If no half-deleted pages are found, it is set without WAL (like a hint
> > bit). (It is safe to miss setting the flag, but it is not safe to miss
> > unsetting it.)
> > This patch works correctly:
> > - if rows are only inserted and never deleted, the index always skips
> > scanning (starting with the second scan).
(Yes, I was confused here..)
> > - if some rows are updated/deleted, then some scans are not skipped. But
> > once all half-deleted pages are marked as deleted, lazy scans start to
> > be skipped.

> > Open question: what to do with index statistics? For simplicity this
> > patch skips updating stats (it just returns NULL from btvacuumcleanup).
> > Probably it should detect the proportion of table changes, and not
> > skip scans if the table grows too much.

Agreed.
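
To make that concrete, the kind of gate I have in mind would look
roughly like the following. Both the scale factor and the stashed
previous tuple count are hypothetical knobs, not anything that exists
today:

#include <stdbool.h>

/*
 * Sketch only: skip the cleanup scan as long as the heap has not grown
 * past a configurable fraction of its size at the previous cleanup.
 * prev_heap_tuples and cleanup_scale_factor are assumed inputs.
 */
static bool
cleanup_scan_can_be_skipped(double prev_heap_tuples,
							double cur_heap_tuples,
							double cleanup_scale_factor)
{
	/* no record of a previous cleanup: play it safe and scan */
	if (prev_heap_tuples < 0)
		return false;

	return cur_heap_tuples <=
		prev_heap_tuples * (1.0 + cleanup_scale_factor);
}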

> Excuse me, I didn't mean to overwrite "last attachment" on the commitfest
> page.

I'm the other one who also posted a patch in this thread:p

I'm not sure how we should behave in this case, but I think it's
inevitable to attach a patch when proposing a fundamental change
of the design in a concrete shape. (But this also confuses the CI
machinery:p)

Actually, it was quite clear what you were thinking.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
