Re: Database 'template1' vacuum

From: Natalie Wenz <nataliewenz(at)ebureau(dot)com>
To: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Cc: Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: Database 'template1' vacuum
Date: 2016-07-26 19:36:11
Message-ID: DDFF9509-FB63-49C4-AAC8-F5813DD8B00E@ebureau.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin


> On Jul 26, 2016, at 1:20 PM, Kevin Grittner <kgrittn(at)gmail(dot)com> wrote:
>
> On Tue, Jul 26, 2016 at 1:08 PM, Natalie Wenz
> <nataliewenz(at)ebureau(dot)com> wrote:
>
>> ERROR: database is not accepting commands to avoid wraparound
>> data loss in database "template1"
>>
>> This error popped up right after I restarted a database that had
>> just finished running a vacuum freeze in single user mode for the
>> postgres database (the only database we use).
>> (That was in response to: ERROR: database is not accepting
>> commands to avoid wraparound data loss in database "postgres")
>>
>> I'm confused about the need to vacuum template1. All of our
>> activity is in the database named postgres. Do all of the
>> databases share the same set of xids?
>
> Yes.
>
>> If they do, why would the single user backend vacuum not handle
>> all of the databases at once?
>
> A connection is always to a particular database. The big question
> is why autovacuum didn't keep you out of trouble. You didn't turn
> that off, did you? If not, please describe your hardware and
> workload, and show you vacuum/autovacuum configuration values.

I can see why you would be wondering that. :) Autovacuum is definitely on. This situation might be considered "maintenance fallout". This database is very large (45 TB, currently, I believe), with most of the space being occupied by one table. (The database is basically a big transaction log, with a nearly all traffic being inserts. No deletes or updates on that big table.). I recently migrated it from 9.3 to 9.5, from one Free BSD host to another. As you can imagine, that took some time. I tried to reduce the time by "dumping" the large table in chunks using "copy" and a date range, thereby increasing the concurrency for both dumping and restoring that data. The problem is, I forgot that, even though I didn't use very many xids importing the data that way, it still meant that alllllll of that data would need to be frozen later. Once the data was all migrated and we resumed our regular traffic, we quickly got into trouble with the xids, since the auto vacuum couldn't freeze all 45 TB of that one big table while fielding up to... 500 million inserts per day? (typically around 6k inserts/second) So we disabled external connections to allow the database to vacuum, buffering the new transactions elsewhere. Now that the first enormous vacuum freeze is complete, we are "unspooling" the data that was buffered while the database vacuumed. Between the hardware and postgres software, we easily see this hit 65 thousand inserts per second. (Wooo, modern hardware! Excellent software!) Of course, that means that we've run out of xids again in the span of about 10 hours; no tuning of the auto vacuum that we've tried is able to keep pace with that. So. I'm currently suffering through a cycle of unspool buffered transactions, vacuum freeze while transactions buffer, unspool . . .

At this point, I have just left the auto vacuum defaults in place for this database, and it generally keeps up with our typical 6k insert/second workload. Even so, maintenance is problematic. We are looking into batch inserts in the future, or table partitioning (would that help with concurrency for maintenance? dumping, restoring, vacuuming?), or both. We also keep hoping that maybe, just maybe, we'll see a version of postgres with 64-bit xids? :D

Anyway, template1: this is expected behavior? I'm maybe on my third or fourth round of unspool, then single user vacuum freeze. The auto vacuum is on and free to run while I'm unspooling. I'm surprised that the auto vacuum wouldn't have already have vacuumed that very small database, or that haven't been prompted to vacuum template1 prior to this, if the only transactions in this database would have occurred during the initdb for the postgres database. Does it get used for other things during normal database operations?

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Kevin Grittner 2016-07-26 19:50:46 Re: Database 'template1' vacuum
Previous Message Kevin Grittner 2016-07-26 18:20:17 Re: Database 'template1' vacuum