Re: autovacuum suggestions for 500,000,000+ row tables?

From: Alex Stapleton <alexs(at)advfn(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: autovacuum suggestions for 500,000,000+ row tables?
Date: 2005-06-21 10:04:45
Message-ID: 7E161634-29A0-4DBC-A303-3675D705A965@advfn.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


On 20 Jun 2005, at 18:46, Josh Berkus wrote:

> Alex,
>
>
>
>> Hi, i'm trying to optimise our autovacuum configuration so that it
>> vacuums / analyzes some of our larger tables better. It has been set
>> to the default settings for quite some time. We never delete
>> anything (well not often, and not much) from the tables, so I am not
>> so worried about the VACUUM status, but I am wary of XID wraparound
>> nuking us at some point if we don't sort vacuuming out so we VACUUM
>> at least once every year ;)
>>
>>
>
> I personally don't use autovaccuum on very large databases. For DW,
> vacuuming is far better tied to ETL operations or a clock schedule of
> downtime.
>

Downtime is something I'd rather avoid if possible. Do you think we
will need to run VACUUM FULL occasionally? I'd rather not lock tables
up unless I cant avoid it. We can probably squeeze an automated
vacuum tied to our data inserters every now and then though.

> XID wraparound may be further away than you think. Try checking
> pg_controldata, which will give you the current XID, and you can
> calculate
> how long you are away from wraparound. I just tested a 200G data
> warehouse
> and figured out that we are 800 months away from wraparound,
> despite hourly
> ETL.
>

Is this an 8.0 thing? I don't have a pg_controldata from what I can
see. Thats nice to hear though.

>
>
>
>> However not running ANALYZE for such huge
>> periods of time is probably impacting the statistics accuracy
>> somewhat, and I have seen some unusually slow queries at times.
>> Anyway, does anyone think we might benefit from a more aggressive
>> autovacuum configuration?
>>
>>
>
> Hmmm, good point, you could use autovacuum for ANALYZE only. Just
> set the
> VACUUM settings preposterously high (like 10x) so it never runs.
> Then it'll
> run ANALYZE only. I generally threshold 200, multiple 0.1x for
> analyze;
> that is, re-analyze after 200+10% of rows have changed.
>

I will try those settings out, that sounds good to me though.

> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco
>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Amit V Shah 2005-06-21 14:01:21 Do Views execute underlying query everytime ??
Previous Message Rohit Gaddi 2005-06-21 07:48:50 index selection by query planner