Re: Vacuuming strategy

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Elanchezhiyan Elango <elanelango(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Vacuuming strategy
Date: 2014-04-30 15:36:08
Message-ID: CAMkU=1wOnB1=8bJcLGBfqThSxFrrdQdx16OwEi8w4jXbYgpfwQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Apr 29, 2014 at 4:59 PM, Elanchezhiyan Elango
<elanelango(at)gmail(dot)com>wrote:

> Hi,
>
> I need help on deciding my vacuuming strategy. I need to know if I ever
> need to do 'vacuum full' for my tables.
>
> Tables1: Following is the query patterns on 4 high traffic table in my
> database:
> 1. Every 5 minutes about 50000 rows in the table are updated. And for a
> given clock hour the same 50000 rows are updated again and again.
> 2. When a new hour begins, another 50000 rows get inserted and they get
> updated every 5 minutes.
> 3. Every night a days worth of old stats are deleted. So this would be 24
> * 50000 = 1.2M records that get deleted every night.
>

What indexes exist? Are the updates to indexed columns? Which day's
tuples are deleted every night? Is it the day a week ago, or a month ago,
or something else?

> Tables3: Another 4 tables have the following pattern:
> 1. Every 1 hour 50000 rows get updated. For a given day the same 50000
> rows are updated again and again.
> 2. When a new day begins, another 50000 rows get inserted and they get
> updated every hour.
> 3. Every night 1.2M records get deleted.
>

If only 50,000 get inserted daily and 1.2M get deleted, this table will
soon be empty! I think you said daily when you meant hourly somewhere in
there.

>
> Tables4: Another 39 tables have the following pattern:
> 1. Every 1 hour 2000 rows gets updated. For a given day the same 2000 rows
> are updated again and again.
> 2. When a new day begins, another 2000 rows get inserted and they get
> updated every hour.
> 3. Every night 48000 rows get deleted.
>
> With the above query pattern with intensive updates and deletes, I need to
> do some aggressive vacuuming.
>
> Current strategy:I am running with default autovacuum settings (postgres
> 9.1.9) and I tried doing a 'vacuum full' for the 8 high traffic tables
> (Tables1 and Tables3) every night. But after a point, the 'vacuum full's
> started timing out (with 4min timeout) every night.
>

Why do you have a 4 minute timeout? That seems counter-productive.

> I think this is because the table is growing bigger (~5GB) and doing a
> vacuum full every night is probably not feasible.
>

It is probably not necessary, but it certainly seems feasible. 4 min * 8
tables = 32 minutes. Call it one hour, since the 4 minute timeout has
started not being enough. Is the 5GB for the table plus indexes, or just
the table itself?

>
> Going with the default autovacuum settings and not doing 'vacuum full' at
> all is also not enough for my usecase. Whenever vacuum full succeeded every
> night, it did seem to reclaim a considerable amount of space.
>

You delete a bunch of tuples every night, so of course a vacuum full after
that is going to return a lot of space. But that space is probably just
going to be needed again the next day. If you don't do the vacuum full,
does the *peak* space keep increasing, or does it stabilize?

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Raymond O'Donnell 2014-04-30 16:27:42 Re: Escape double-quotes in text[]?
Previous Message Michael Bostock 2014-04-30 14:18:58 Re: Ingres to Postgres migration