Re: Table Vacuum Taking a Long Time

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Eliot Gable <egable+pgsql-general(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Table Vacuum Taking a Long Time
Date: 2014-04-02 14:44:57
Message-ID: 20140402144457.GA22921@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Eliot Gable wrote:
> I have a table which is about 12 GB in size. It has had a vacuum full
> analyze run on it, and then immediately after, I run vacuum analyze and it
> takes about 90 seconds to complete. If I continue to run vacuum analyze on
> it, it continues to take about 90 seconds each time. This system has a
> single 7200 RPM drive in it, so it's not a very fast drive. I was under the
> impression that if nothing had been done to the table since it was last
> vacuumed, that it would return immediately. Further, this is an append-only
> table, so why should it need to be vacuumed at all? We ran into cases where
> after writing to it long enough, the PGSQL autovacuum process would kick in
> and force a vacuum saying something about preventing wrap around. I don't
> understand why it would do this if it is append-only and we are using
> 64-bit sequences as IDs without OIDs turned on. What would be wrapping
> around without a vacuum? We tried to mitigate this by manually running
> vacuum programmatically, but then we end up using all the disk IO just
> running vacuum all the time, because it is constantly running through the
> entire table even though very little (if anything) has been done to it
> since the last vacuum.

Vacuuming a table requires first scanning the heap, then scanning each
indexes. The heap scan can be optimized using the visibility map: pages
can be skipped if they are marked all-visible. Vacuum will mark pages
all-visible, so a second vacuum immediately thereafter would need to
read few pages (and probably modify even fewer). But the index scans
cannot be optimized in that fashion; indexes must be scanned completely
each time. I would bet that that's where the time is going. Also note
that not all pages can be marked all-visible, and that pages are only
skipped if there are enough of them consecutive that it's worth when
considering disk readahead done by the operating system.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Andrew Sullivan 2014-04-02 14:46:40 Re: Insert zero to auto increment serial column
Previous Message François Beausoleil 2014-04-02 14:44:49 Re: Table Vacuum Taking a Long Time