From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL insert delay settings
Date: 2019-02-18 17:12:36
Message-ID: 20190218171236.GI6197@tamriel.snowman.net

Greetings,

* Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
> On 2/15/19 7:41 PM, Andres Freund wrote:
> > On 2019-02-15 08:50:03 -0500, Stephen Frost wrote:
> >> * Andres Freund (andres(at)anarazel(dot)de) wrote:
> >>> On 2019-02-14 11:02:24 -0500, Stephen Frost wrote:
> >>>> On Thu, Feb 14, 2019 at 10:15 Peter Eisentraut <
> >>>> peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> >>>>> On 14/02/2019 11:03, Tomas Vondra wrote:
> >>>>>> But if you add extra sleep() calls somewhere (say because there's also
> >>>>>> limit on WAL throughput), it will affect how fast VACUUM works in
> >>>>>> general. Yet it'll continue with the cost-based throttling, but it will
> >>>>>> never reach the limits. Say you do another 20ms sleep somewhere.
> >>>>>> Suddenly it means it only does 25 rounds/second, and the actual write
> >>>>>> limit drops to 4 MB/s.
> >>>>>
> >>>>> I think at a first approximation, you probably don't want to add WAL
> >>>>> delays to vacuum jobs, since they are already slowed down, so the rate
> >>>>> of WAL they produce might not be your first problem. The problem is
> >>>>> more things like CREATE INDEX CONCURRENTLY that run at full speed.
> >>>>>
> >>>>> That leads to an alternative idea of expanding the existing cost-based
> >>>>> vacuum delay system to other commands.
> >>>>>
> >>>>> We could even enhance the cost system by taking WAL into account as an
> >>>>> additional factor.
> >>>>
> >>>> This is really what I was thinking- let’s not have multiple independent
> >>>> ways of slowing down maintenance and similar jobs to reduce their impact on
> >>>> I/O to the heap and to WAL.
> >>>
> >>> I think that's a bad idea. Both because the current vacuum code is
> >>> *terrible* if you desire higher rates because both CPU and IO time
> >>> aren't taken into account. And it's extremely hard to control. And it
> >>> seems entirely valuable to be able to limit the amount of WAL generated
> >>> for replication, but still try to get the rest of the work done as
> >>> quickly as reasonably possible wrt local IO.
> >>
> >> I'm all for making improvements to the vacuum code and making it easier
> >> to control.
> >>
> >> I don't buy off on the argument that there is some way to segregate the
> >> local I/O question from the WAL when we're talking about these kinds of
> >> operations (VACUUM, CREATE INDEX, CLUSTER, etc.) on logged relations, nor
> >> do I think we do our users a service by giving them independent knobs
> >> for both that will undoubtedly end up making it more difficult to
> >> understand and control what's going on overall.
> >>
> >> Even here, it seems, you're arguing that the existing approach for
> >> VACUUM is hard to control; wouldn't adding another set of knobs for
> >> controlling the amount of WAL generated by VACUUM make that worse? I
> >> have a hard time seeing how it wouldn't.
> >
> > I think it's because I see them as, often, having two largely
> > independent use cases. If your goal is to avoid swamping replication
> > with WAL, you don't necessarily care about also throttling VACUUM
> > (or REINDEX, or CLUSTER, or ...)'s local IO. By forcing the two to be
> > combined you just make the whole feature less usable.
>
> I agree with that.
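
(To spell out the arithmetic in Tomas's example above: at a 20ms sleep
per throttling round, vacuum completes at most 50 rounds/second; adding
a second, independent 20ms sleep makes each round 40ms, i.e. 25
rounds/second, so whatever write rate the cost parameters allow per
round is effectively halved- in his example, from 8 MB/s down to
4 MB/s.)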

I can agree that they're different use-cases, but one does end up
impacting the other, and that's what I had been thinking about from the
perspective of "if we could provide just one knob for this."

VACUUM is a pretty good example- if we're dirtying a page with VACUUM
then we're also writing that page into the WAL (at least, if the
relation isn't unlogged). Now, VACUUM does do other things (such as
read pages), as does REINDEX or CLUSTER, so maybe there's a way to think
about this feature in those terms- a cost for doing local read I/O, vs.
a cost for doing write I/O to both heap and WAL, vs. a cost for doing
"local" write I/O (just to the heap, i.e. for unlogged tables).

What I was trying to say I didn't like previously was the idea of
having both a "local write I/O" cost *and* a separate "WAL write" cost
for a VACUUM against a logged table, since those two are very tightly
correlated.

The current costing mechanism in VACUUM only provides the single hammer
of "if we hit the limit, go to sleep for a while", which seems a bit
unfortunate- if we haven't hit the "read I/O" limit, it'd be nice if we
could keep going and then come back to writing out pages once enough
time has passed that we're below our "write I/O" limit. That would end
up requiring quite a bit of change to how we do things though, I expect,
so it's probably not something to tie into this particular feature, but
I wanted to express the thought in case others found it interesting.
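
For reference, the single hammer is roughly this (paraphrasing
vacuum_delay_point() from memory, so the details may be slightly off):

    /*
     * One shared balance, one limit, and one response- sleep- once the
     * limit is hit, no matter which kind of work consumed the budget.
     */
    if (VacuumCostActive && VacuumCostBalance >= VacuumCostLimit)
    {
        int     msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;

        if (msec > VacuumCostDelay * 4)
            msec = VacuumCostDelay * 4;

        pg_usleep(msec * 1000L);
        VacuumCostBalance = 0;
    }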

> > I think it'd not be insane to add two things:
> > - WAL write rate limiting, independent of the vacuum stuff. It'd also be
> > used by lots of other bulk commands (CREATE INDEX, ALTER TABLE
> > rewrites, ...)
> > - Account for WAL writes in the current vacuum costing logic, by
> > accounting for it using a new cost parameter
> >
> > Then VACUUM would be throttled by the *minimum* of the two, which seems
> > to make plenty sense to me, given the usecases.
>
> Is it really minimum? If you add another cost parameter to the vacuum
> model, then there's almost no chance of actually reaching the limit
> because the budget (cost_limit) is shared with other stuff (local I/O).

Yeah, that does seem like it'd be an issue.
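
To illustrate the shared-budget problem concretely (all of these names
are made up, just for illustration):

    #include <stdbool.h>

    static int io_balance, wal_balance;
    static const int io_limit = 200, wal_limit = 200;

    /*
     * (a) One shared budget: the WAL cost competes with the local-I/O
     * cost for the same limit, so the sleep fires well before either
     * nominal per-resource limit is reached on its own.
     */
    static bool
    need_sleep_shared(void)
    {
        return io_balance + wal_balance >= io_limit;
    }

    /*
     * (b) Two independent budgets: sleep when either one is exhausted,
     * i.e. throughput is bounded by the minimum of the two throttles,
     * which is what Andres describes above.
     */
    static bool
    need_sleep_independent(void)
    {
        return io_balance >= io_limit || wal_balance >= wal_limit;
    }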

> FWIW I do think the ability to throttle WAL is a useful feature, I just
> don't want to shoot myself in the foot by making other things worse.
>
> As you note, the existing VACUUM throttling is already hard to control,
> this seems to make it even harder.

Agreed.

Thanks!

Stephen
