Re: autovacuum scheduling starvation and frenzy

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum scheduling starvation and frenzy
Date: 2014-10-01 15:31:55
Message-ID: CA+TgmoYz=skw1A+AwKFnsMMRVePE_Go+ceY6d+WPqtni7_iQmQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 30, 2014 at 5:59 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Jeff Janes wrote:
>> > I think that instead of
>> > trying to get a single target database in that foreach loop, we could
>> > try to build a prioritized list (in-wraparound-danger first, then
>> > in-multixid-wraparound danger, then the one with the oldest autovac time
>> > of all the ones that remain); then recheck the wrap-around condition by
>> > seeing whether there are other workers in that database that started
>> > after the wraparound condition appeared.
>>
>> I think we would want to check for one worker that is still running, and at
>> least one other worker that started and completed since the wraparound
>> threshold was exceeded. If there are multiple tables in the database that
>> need full scanning, it would make sense to have multiple workers. But if a
>> worker already started and finished without increasing the frozenxid, then
>> another attempt probably won't accomplish much either. But I have no idea
>> how to do that bookkeeping, or how much of an improvement it would be over
>> something simpler.
>
> How about something like this:
>
> * if autovacuum is disabled, then don't check these conditions; the only
> reason we're in do_start_worker() in that case is that somebody
> signalled postmaster that some database needs a for-wraparound emergency
> vacuum.
>
> * if autovacuum is on, and the database was processed less than
> autovac_naptime/2 ago, and there are workers still running in that
> database now, then ignore the database.
>
> Otherwise, consider it for xid-wraparound vacuuming. So if we launched
> a worker recently, but it already finished, we would start another one.
> (If the worker finished, the database should not be in need of a
> for-wraparound vacuum again, so this seems sensible.) Also, a database
> in danger is considered again sooner than after the full autovac_naptime
> period, though not immediately after the previous worker started, which
> should leave room for other databases to be processed.
>
> The attached patch implements that. I only tested it on HEAD, but
> AFAICS it applies cleanly to 9.4 and 9.3; fairly sure it won't apply to
> 9.2. Given the lack of complaints, I'm unsure about backpatching
> further back than 9.3 anyway.
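
For concreteness, the proposed skip test amounts to roughly the sketch
below. AutoVacuumingActive() and TimestampDifferenceExceeds() are real
backend functions, but db_last_processed_time() and
db_has_running_worker() are hypothetical stand-ins for the launcher's
bookkeeping; this is an illustration, not the code from the attached
patch:

/*
 * Sketch of the proposed check in do_start_worker().
 */
static bool
skip_db_for_wraparound(Oid dbid, TimestampTz now)
{
    /*
     * With autovacuum disabled, we only get here because the postmaster
     * was signalled about a wraparound emergency, so never skip.
     */
    if (!AutoVacuumingActive())
        return false;

    /*
     * Ignore the database if it was processed less than
     * autovacuum_naptime/2 ago and a worker is still running in it.
     * If the previous worker already finished but the database still
     * appears to be in danger, let another worker start.
     */
    if (!TimestampDifferenceExceeds(db_last_processed_time(dbid), now,
                                    autovacuum_naptime * 1000 / 2) &&
        db_has_running_worker(dbid))
        return true;

    return false;
}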

This kind of seems like throwing darts at the wall. It will be better
if we are right to skip the database that is already being vacuumed for
wraparound, and worse if we are not.

I'm not sure that we should do this at all, or at least not without
testing it extensively first. We could easily shoot ourselves in the
foot.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
