From: | "Steven Flatt" <steven(dot)flatt(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Database-wide VACUUM ANALYZE |
Date: | 2007-06-21 17:09:57 |
Message-ID: | 357fa7590706211009w2ef6071fn83c5759db6be1e80@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
We recently upgraded a very large database (~550 GB) from 8.1.4 to 8.2.4 via
a pg_dump and pg_restore. (Note that the restore took several days.) We
had accepted the default settings:
vacuum_freeze_min_age = 100 million
autovacuum_freeze_max_age = 200 million
Due to our very high transaction rate, it appears that a database-wide
vacuum kicked off approximately 2 weeks after the restore. (Aside: after
reading the docs and considering our system characteristics, I know now that
our autovacuum_freeze_max_age should be more like 2 billion. However on
this machine I haven't changed the config settings yet.) Also, I believe,
that due to the bulk of our data having the same "age" after the restore,
the db-wide vacuum had *a lot* of rows to mark with the FrozenXID.
The good thing is that the db-wide vacuum, which ran for a long time, was
reasonably non-intrusive to other database activity (somewhat, but
reasonable for the short term). The other good thing was that concurrent
autovacuum processes were still vacuuming/analyzing tables as necessary.
The bad thing, which I don't totally understand from reading the docs, is
that another db-wide vacuum kicked off exactly 24 hours after the first
db-wide vacuum kicked off, before the first one had finished. (Note that
these vacuums seem to go through the tables alphabetically.) I managed to
explain this to myself in that there were still rows in tables not yet
touched by the first db-wide vacuum that could have XIDs older than
autovacuum_freeze_max_age. Fine, so two db-wide vacuums were now taking
place, one behind the other.
The first db-wide vacuum finished approximately 36 hours after it started.
At this point I was convinced that the second db-wide vacuum would run to
completion with little or no work to do and all would be good. The thing I
can't explain is why a third db-wide vacuum kicked off exactly 24 hours
(again) after the second db-wide vacuum kicked off (and the second vacuum
still running).
Wouldn't the first db-wide vacuum have marked any rows that needed it with
the FrozenXID? Why would a third db-wide vacuum kick off so soon after the
first db-wide vacuum had completed? Surely there haven't been 100 million
more transactions in the last two days?
Can someone explain what is going on here? I can't quite figure it out
based on the docs.
Thanks,
Steve
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Sullivan | 2007-06-21 17:11:48 | Re: Performance query about large tables, lots of concurrent access |
Previous Message | Sabin Coanda | 2007-06-21 16:53:54 | vacuum a lot of data when insert only |