From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> |
Subject: | Re: GUC for cleanup indexes threshold. |
Date: | 2017-09-22 08:21:04 |
Message-ID: | CAD21AoBN9ucgMDuinx2ptU8upEToHnR-A35aBcQyZnLFvWdVPg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Sep 22, 2017 at 4:16 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> At Fri, 22 Sep 2017 15:00:20 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoD6zgb1W6ps1aXj0CcAB_chDYiiTNtEdpMhkefGg13-GQ(at)mail(dot)gmail(dot)com>
>> On Tue, Sep 19, 2017 at 3:31 PM, Kyotaro HORIGUCHI
>> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> > I was just looking the thread since it is found left alone for a
>> > long time in the CF app.
>> >
>> > At Mon, 18 Sep 2017 16:35:58 -0700, Peter Geoghegan <pg(at)bowt(dot)ie> wrote in <CAH2-WzkhJhAXD+6DdBp7D8WYLfJ3D0m=AZbGsiw=USUjTmuv-g(at)mail(dot)gmail(dot)com>
>> >> On Wed, Apr 5, 2017 at 3:50 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> >> > Hi,
>> >> >
>> >> > On 2017-04-01 03:05:07 +0900, Masahiko Sawada wrote:
>> >> >> On Fri, Mar 31, 2017 at 11:44 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> >> >> [ lots of valuable discussion ]
>> >> >
>> >> > I think this patch clearly still is in the design stage, and has
>> >> > received plenty feedback this CF. I'll therefore move this to the next
>> >> > commitfest.
>> >>
>> >> Does anyone have ideas on a way forward here? I don't, but then I
>> >> haven't thought about it in detail in several months.
>> >
>> > Is the additional storage in metapage to store the current status
>> > of vaccum is still unacceptable even if it can avoid useless
>> > full-page scan on indexes especially for stable tables?
>> >
>> > Or, how about additional 1 bit in pg_stat_*_index to indicate
>> > that the index *don't* require vacuum cleanup stage. (default
>> > value causes cleanup)
>>
>> You meant that "the next cycle" is the lazy_cleanup_index() function
>> called by lazy_scan_heap()?
>
> Both finally call btvacuumscan under a certain condition, but
> what I meant by "the next cycle" is the lazy_cleanup_index call
> in the next round of vacuum since abstract layer (index am) isn't
> conscious of the detail of btree.
>
>> > index_bulk_delete (or ambulkdelete) returns the flag in
>> > IndexBulkDeleteResult then lazy_scan_heap stores the flag in
>> > stats and in the next cycle it is looked up to decide the
>> > necessity of index cleanup.
>> >
>>
>> Could you elaborate about this? For example in btree index, the index
>> cleanup skips to scan on the index scan if index_bulk_delete has been
>> called during vacuuming because stats != NULL. So I think we don't
>> need such a flag.
>
> The flag works so that successive two index full scans don't
> happen in a vacuum round. If any rows are fully deleted, just
> following btvacuumcleanup does nothing.
>
> I think what you wanted to solve here was the problem that
> index_vacuum_cleanup runs a full scan even if it ends with no
> actual work, when manual or anti-wraparound vacuums. (I'm
> getting a bit confused on this..) It is caused by using the
> pointer "stats" as the flag to instruct to do that. If the
> stats-as-a-flag worked as expected, the GUC doesn't seem to be
> required.
Hmm, my proposal is like that if a table doesn't changed since the
previous vacuum much we skip the cleaning up index.
If the table has at least one garbage we do the lazy_vacuum_index and
then IndexBulkDeleteResutl is stored, which causes to skip doing the
btvacuumcleanup. On the other hand, if the table doesn't have any
garbage but some new tuples inserted since the previous vacuum, we
don't do the lazy_vacuum_index but do the lazy_cleanup_index. In this
case, we always do the lazy_cleanup_index (i.g, we do the full scan)
even if only one tuple is inserted. That's why I proposed a new GUC
parameter which allows us to skip the lazy_cleanup_index in the case.
>
> Addition to that, as Simon and Peter pointed out
> index_bulk_delete can leave not-fully-removed pages (so-called
> half-dead pages and pages that are recyclable but not registered
> in FSM, AFAICS) in some cases mainly by RecentGlobalXmin
> interlock. In this case, just inhibiting cleanup scan by a
> threshold lets such dangling pages persist in the index. (I
> conldn't make such a many dangling pages, though..)
>
> The first patch in the mail (*1) does that. It seems having some
> bugs, though..
>
>
> Since the dangling pages persist until autovacuum decided to scan
> the belonging table again, we should run a vacuum round (or
> index_vacuum_cleanup itself) even having no dead rows if we want
> to clean up such pages within a certain period. The second patch
> doesn that.
>
IIUC half-dead pages are not relevant to this proposal. The proposal
has two problems;
* By skipping index cleanup we could leave recyclable pages that are
not marked as a recyclable.
* we stash an XID when a btree page is deleted, which is used to
determine when it's finally safe to recycle the page
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro HORIGUCHI | 2017-09-22 08:31:10 | Re: GUC for cleanup indexes threshold. |
Previous Message | Pavel Stehule | 2017-09-22 08:21:01 | Re: [PROPOSAL] Temporal query processing with range types |