Re: WAL insert delay settings

From: Andres Freund <andres(at)anarazel(dot)de>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL insert delay settings
Date: 2019-02-15 18:41:21
Message-ID: 20190215184121.firmhjtcnkejmjxz@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-02-15 08:50:03 -0500, Stephen Frost wrote:
> * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > On 2019-02-14 11:02:24 -0500, Stephen Frost wrote:
> > > On Thu, Feb 14, 2019 at 10:15 Peter Eisentraut <
> > > peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> > > > On 14/02/2019 11:03, Tomas Vondra wrote:
> > > > > But if you add extra sleep() calls somewhere (say because there's also
> > > > > limit on WAL throughput), it will affect how fast VACUUM works in
> > > > > general. Yet it'll continue with the cost-based throttling, but it will
> > > > > never reach the limits. Say you do another 20ms sleep somewhere.
> > > > > Suddenly it means it only does 25 rounds/second, and the actual write
> > > > > limit drops to 4 MB/s.
> > > >
> > > > I think at a first approximation, you probably don't want to add WAL
> > > > delays to vacuum jobs, since they are already slowed down, so the rate
> > > > of WAL they produce might not be your first problem. The problem is
> > > > more things like CREATE INDEX CONCURRENTLY that run at full speed.
> > > >
> > > > That leads to an alternative idea of expanding the existing cost-based
> > > > vacuum delay system to other commands.
> > > >
> > > > We could even enhance the cost system by taking WAL into account as an
> > > > additional factor.
> > >
> > > This is really what I was thinking- let’s not have multiple independent
> > > ways of slowing down maintenance and similar jobs to reduce their impact on
> > > I/O to the heap and to WAL.
> >
> > I think that's a bad idea. Both because the current vacuum code is
> > *terrible* if you desire higher rates because both CPU and IO time
> > aren't taken into account. And it's extremely hard to control. And it
> > seems entirely valuable to be able to limit the amount of WAL generated
> > for replication, but still try to get the rest of the work done as
> > quickly as reasonably possible wrt local IO.
>
> I'm all for making improvements to the vacuum code and making it easier
> to control.
>
> I don't buy off on the argument that there is some way to segregate the
> local I/O question from the WAL when we're talking about these kinds of
> operations (VACUUM, CREATE INDEX, CLUSTER, etc) on logged relations, nor
> do I think we do our users a service by giving them independent knobs
> for both that will undoubtably end up making it more difficult to
> understand and control what's going on overall.
>
> Even here, it seems, you're arguing that the existing approach for
> VACUUM is hard to control; wouldn't adding another set of knobs for
> controlling the amount of WAL generated by VACUUM make that worse? I
> have a hard time seeing how it wouldn't.

I think it's because I see them as, often, having two largely
independent use cases. If your goal is to avoid swamping replication
with WAL, you don't necessarily care about also throttling VACUUM (or
REINDEX, or CLUSTER, or ...)'s local IO. By forcing the two to be
combined, you just make the whole feature less usable.

I think it'd not be insane to add two things:
- WAL write rate limiting, independent of the vacuum stuff. It'd also be
  used by lots of other bulk commands (CREATE INDEX, ALTER TABLE
  rewrites, ...)
- Account for WAL writes in the current vacuum costing logic, by
  accounting for it using a new cost parameter

Then VACUUM would be throttled by whichever of the two limits is
stricter (the *minimum* of the two rates), which makes plenty of sense
to me, given the use cases.
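
To make the "minimum of the two" concrete, here is a rough sketch in
Python. It is a toy model under stated assumptions, not PostgreSQL code:
the class names, the wal_cost_per_kb parameter and the numeric values
are all hypothetical and only illustrate how the two limits would
combine. Each limit says how long a batch of work must take at minimum,
and the longer of the two durations wins, i.e. the lower rate applies.

```python
from dataclasses import dataclass


@dataclass
class WalRateLimit:
    """Hypothetical standalone WAL limit, independent of vacuum costing."""
    max_bytes_per_sec: float

    def required_seconds(self, wal_bytes: int) -> float:
        # Minimum wall-clock time needed to emit this much WAL.
        return wal_bytes / self.max_bytes_per_sec


@dataclass
class VacuumCost:
    """Toy cost-based delay model with an extra WAL cost (assumption)."""
    cost_limit: int = 200          # cost budget per round (illustrative)
    cost_delay: float = 0.020      # sleep once the budget is used up
    page_dirty_cost: int = 20      # cost charged per dirtied page
    wal_cost_per_kb: float = 1.0   # hypothetical new cost parameter

    def required_seconds(self, pages_dirtied: int, wal_bytes: int) -> float:
        cost = (pages_dirtied * self.page_dirty_cost
                + (wal_bytes / 1024) * self.wal_cost_per_kb)
        # One sleep of cost_delay for every cost_limit units accumulated.
        return cost / self.cost_limit * self.cost_delay


def throttled_seconds(pages_dirtied: int, wal_bytes: int,
                      vac: VacuumCost, wal: WalRateLimit) -> float:
    """Longest required duration wins, so the lower rate applies."""
    return max(vac.required_seconds(pages_dirtied, wal_bytes),
               wal.required_seconds(wal_bytes))
```

With a tight WAL limit the WAL side dominates and VACUUM is slowed
further; with a generous one, the cost-based delay remains the limiting
factor. Other bulk commands would only see the WAL limit.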

Greetings,

Andres Freund
