Re: new autovacuum criterion for visible pages

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new autovacuum criterion for visible pages
Date: 2017-02-03 18:55:48
Message-ID: CAMkU=1xNO_WrDgvA5s99N_7_KmNzP51WUQLM0n-ZaCLGmUNgTQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 21, 2017 at 1:57 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> All,
>
> * Simon Riggs (simon(at)2ndquadrant(dot)com) wrote:
> > On 12 August 2016 at 01:01, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> > >> In short, autovacuum will need to scan by itself the VM of each
> > >> relation and decide based on that.
> > >
> > > That seems like a worthwhile approach to pursue. The VM is supposed
> to be
> > > small, and if you're worried it isn't, you could sample a few pages of
> it.
> > > I do not think any of the ideas proposed so far for tracking the
> > > visibility percentage on-the-fly are very tenable.
> >
> > Sounds good, but we can't scan the VM for every table, every minute.
> > We need to record something that will tell us how many VM bits have
> > been cleared, which will then allow autovac to do a simple SELECT to
> > decide what needs vacuuming.
> >
> > Vik's proposal to keep track of the rows inserted seems like the best
> > approach to this issue.
>
> I tend to agree with Simon on this. I'm also worried that an approach
> which was based off of a metric like "% of table not all-visible" might
> result in VACUUM running over and over on a table because it isn't able
> to actually make any progress towards improving that percentage. We'd
> have to have some kind of "cool-off" period or something.
>

This is why I didn't want a global guc for it but only a relopt. I figured
I would set it only for tables which I have good reason to know would
benefit, because I know that they are both candidates for beneficial IOS,
and because it is possible to keep them mostly all-visible with a
reasonable amount of vacuum work. I think that this is a small but
important subset of tables (currently, I personally have zero such tables,
but that could increase if covering indexes get implemented). It would be
nice to have settings that users of all experience level would understand
(or no settings at all), but I don't think that it is feasible to have that
without compromising the basic functionality.

> Tracking INSERTs and then kicking off a VACUUM based on them seems to
> address that in a natural way and also seems like something that users
> would generally understand as it's very similar to what we do for
> UPDATEs and DELETEs.
>

If we do an INSERT based count with an extra knob to control how that gets
weighted when added to the vacuum threshold function, then I could use that
knob to micromanage to my satisfaction. If there is no knob added, then I
think that I and many other people are going to see their vacuum workload
skyrocket for no benefit and with no recourse, other than disabling autovac.

Cheers,

Jeff

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-02-03 19:09:30 Re: Packages: Again
Previous Message Alvaro Herrera 2017-02-03 18:42:49 Re: Index corruption with CREATE INDEX CONCURRENTLY