Re: Autovacuum running out of memory

From: Alexis Lê-Quôc <alq(at)datadoghq(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Autovacuum running out of memory
Date: 2011-08-10 18:47:30
Message-ID: CAAGz8TPX8O7e=r8ZsLEE2ABW6yv9gZreXqGeDydJgeQF2oCf_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Aug 10, 2011 at 1:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alexis Le-Quoc <alq(at)datadoghq(dot)com> writes:
>> I've been hitting a "out of memory error" during autovacuum of
>> relatively large tables (compared to the amount of RAM available).
>
>> The error message is:
>> [10236]: [1-1] user=,db=,remote= ERROR:  out of memory
>> [10236]: [2-1] user=,db=,remote= DETAIL:  Failed on request of size 395973594.
>> [10236]: [3-1] user=,db=,remote= CONTEXT:  automatic vacuum of table
>> "***.public.serialized_series"
>
>> --- postgresql.conf (subset) ----
>> shared_buffers = 1971421kB
>> work_mem = 9857kB
>> maintenance_work_mem = 752MB
>
> Since the memory map shows that not very much memory has been allocated
> by VACUUM yet, I suspect it's failing while trying to create the work
> array for remembering dead tuple TIDs.  It will assume that it can use
> up to maintenance_work_mem for that.  (The fact that it didn't ask for
> the whole 752MB probably means this is a relatively small table in
> which there couldn't possibly be that many TIDs.)  So the short answer
> is "reduce maintenance_work_mem to something under 300MB".
>
> However, I find it a bit odd that you're getting this failure in what
> appears to be a 64-bit build.  That means you're not running out of
> address space, so you must actually be out of RAM+swap.  Does the
> machine have only 4GB or so of RAM?  If so, that value for
> shared_buffers is unrealistically large; it's not leaving enough RAM for
> other purposes such as this.

The box has little under 8GB (it's on EC2, a "m1.large" instance)

total used free shared buffers cached
Mem: 7700 6662 1038 0 25 6078
-/+ buffers/cache: 558 7142
Swap: 0 0 0

There is no swap.

> Where did you get the above-quoted parameter settings, anyway?  They
> seem a bit weird, as in written to many more decimal places than anyone
> could really expect to mean anything.

I have them computed by our configuration management system. Here's
the logic behind it (edited from ruby):

# Compute shared memory for procps
page_size = getconf PAGE_SIZE
phys_pages = getconf _PHYS_PAGES
shmall = phys_pages
shmmax = shmall * page_size

shared_buffers = kb_memory_total / 4
work_mem = (kb_memory_total / max_connections / 4)
maintenance_work_mem = (kb_memory_total * 100 / (1024 * 1024))

In turn they come from High-Performance Postgresql 9.0
(http://www.postgresql.org/about/news.1249)

Thanks,

--
Alexis Lê-Quôc

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2011-08-10 18:54:23 Re: Autovacuum running out of memory
Previous Message Kevin Grittner 2011-08-10 18:10:16 Re: poor pefrormance with regexp searches on large tables