Re: AUTOVACUUM after purging a lot of records

From: Jeni Fifrick <jfifrick(at)incomm(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: AUTOVACUUM after purging a lot of records
Date: 2013-12-13 21:51:20
Message-ID: E59C57F2AE0F8641B5B428292D88479F26D5E384@ATLPXCH02MB.uss.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi Kevin,

Thank you very much for the input.

· I think, I'll let autovaccum run and check it tomorrow.

So, what you're saying is if I run the "VACUUM ANALYZE" while the autovacuum still running, the autovacuum will be cancelled, right? And, I need to set the maintenance_work_mem in the session before executing the 'VACUUM ANALYZE transactionlog'. Is this a correct syntax?

Regarding the configuration, all the memory related are still with default value.

We did change the max_connections to be 2000. BUT, so far, our max connection is around 500.

We’re planning to do upgrade early next year. Since this is holiday season (and we’re on business peak during this time), so management is really careful in making any changes, EXCEPT absolutely necessary (or break fix).

Thanks again,

Jeni

-----Original Message-----
From: Kevin Grittner [mailto:kgrittn(at)ymail(dot)com]
Sent: Friday, December 13, 2013 4:24 PM
To: Jeni Fifrick; pgsql-admin(at)postgresql(dot)org
Subject: Re: [ADMIN] AUTOVACUUM after purging a lot of records

Jeni Fifrick <jfifrick(at)incomm(dot)com<mailto:jfifrick(at)incomm(dot)com>> wrote:

> Is it normal for vacumm to run that long (23 hour)?

It can be. The autovacuum workers are "paced" so that their resource usage is limited. This spreads out the work, to minimize impact on foreground processes. In some cases the default configuration is not aggressive enough to keep up overall, but that should be what you look at, not whether a particular worker takes days on a particular table -- that in itself doesn't indicate a problem.

> Is there a way to verify whether the vacuum process is running

> properly and making any progress?

Use OS tools (like top or ps) to check whether the process is still consuming resources.

> Can I increase the maintenance_work_mem parameter to make the existing

> process faster?

That is one of many settings which are often tuned. If you really want to cause that table to finish quickly, you could just start a VACUUM ANALYZE command for that table. The autovacuum run will cancel to make room for it, and the command default to running full speed rather than pacing its work. You could set maintenance_work_mem or cost-based pacing for the command in the session before running the command to optimize it or pace it.

> Do you think it’s better to do VACUUM FULL, consider the amount of

> records deleted?

That depends on how much data is left and whether you can afford to have the table exclusively locked while its data is copied and re-indexed.

> Any other input/recommendation you can give us?

If you expect to purge sets of data like that, and you know what purge group a row belongs in at the time it is inserted, you should probably look at partitioning. There's no faster way to drop millions of rows from a table than to drop the partition they are in.

> We’re using Vpostgres version 9.1.3

If you want to stay with 9.1 for now, you should upgrade to the latest minor release.

http://www.postgresql.org/support/versioning/

> on CentOS 6.3. 12 cores and 64 GB memory.

> But, we’re using default setup.

With a machine that size, you definitely should tune several settings. The defaults are geared more toward being able to "install and go" on a low-end laptop. Serious servers need tuning.

If you really are at the default configuration (including max_connections), the following would almost certainly be

improvements:

shared_buffers = 8GB

effective_cache_size = 32GB

work_mem = 160MB

maintenance_work_mem = 2GB

wal_buffers = 32MB

checkpoint_completion_target = 0.9

checkpoint_segments = 64

autovacuum_vacuum_cost_limit = 800

cpu_tuple_cost = 0.03

Depending on the size of your active data set, you may also want to reduce random_page_cost; but you didn't provide enough information to know that. That's hardly a complete list, but it should help with performance on most servers and workloads with the RAM and cores you describe.

--

Kevin Grittner

EDB: http://www.enterprisedb.com

The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Campbell, Lance 2013-12-13 21:55:30 script to drop and build all indexes in database
Previous Message Kevin Grittner 2013-12-13 21:24:10 Re: AUTOVACUUM after purging a lot of records