Re: postgres invoked oom-killer

From: Lacey Powers <lacey(dot)powers(at)commandprompt(dot)com>
To: Silvio Brandani <silvio(dot)brandani(at)tech(dot)sdb(dot)it>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: postgres invoked oom-killer
Date: 2010-05-07 15:15:14
Message-ID: 4BE42E82.3070801@commandprompt.com
Lists: pgsql-admin

Silvio Brandani wrote:
> We are running PostgreSQL 8.3.8 on Linux.
>
> We get the following messages in /var/log/messages:
>
> May 6 22:31:01 pgblade02 kernel: postgres invoked oom-killer:
> gfp_mask=0x201d2, order=0, oomkilladj=0
> May 6 22:31:01 pgblade02 kernel:
> May 6 22:31:01 pgblade02 kernel: Call Trace:
> May 6 22:31:19 pgblade02 kernel: [<ffffffff800bed05>]
> out_of_memory+0x8e/0x2f5
> May 6 22:31:19 pgblade02 kernel: [<ffffffff8000f071>]
> __alloc_pages+0x22b/0x2b4
> May 6 22:31:19 pgblade02 kernel: [<ffffffff80012720>]
> __do_page_cache_readahead+0x95/0x1d9
> May 6 22:31:19 pgblade02 kernel: [<ffffffff800618e1>]
> __wait_on_bit_lock+0x5b/0x66
> May 6 22:31:19 pgblade02 kernel: [<ffffffff881fdc61>]
> :dm_mod:dm_any_congested+0x38/0x3f
> May 6 22:31:19 pgblade02 kernel: [<ffffffff800130ab>]
> filemap_nopage+0x148/0x322
> May 6 22:31:19 pgblade02 kernel: [<ffffffff800087ed>]
> __handle_mm_fault+0x1f8/0xdf4
> May 6 22:31:19 pgblade02 kernel: [<ffffffff80064a6a>]
> do_page_fault+0x4b8/0x81d
> May 6 22:31:19 pgblade02 kernel: [<ffffffff80060f29>]
> thread_return+0x0/0xeb
> May 6 22:31:19 pgblade02 kernel: [<ffffffff8005bde9>]
> error_exit+0x0/0x84
> May 6 22:31:27 pgblade02 kernel:
> May 6 22:31:28 pgblade02 kernel: Mem-info:
> May 6 22:31:28 pgblade02 kernel: Node 0 DMA per-cpu:
> May 6 22:31:28 pgblade02 kernel: cpu 0 hot: high 0, batch 1 used:0
> May 6 22:31:28 pgblade02 kernel: cpu 0 cold: high 0, batch 1 used:0
> May 6 22:31:28 pgblade02 kernel: cpu 1 hot: high 0, batch 1 used:0
> May 6 22:31:28 pgblade02 kernel: cpu 1 cold: high 0, batch 1 used:0
> May 6 22:31:28 pgblade02 kernel: cpu 2 hot: high 0, batch 1 used:0
> May 6 22:31:28 pgblade02 kernel: cpu 2 cold: high 0, batch 1 used:0
> May 6 22:31:28 pgblade02 kernel: cpu 3 hot: high 0, batch 1 used:0
> May 6 22:31:28 pgblade02 kernel: cpu 3 cold: high 0, batch 1 used:0
> May 6 22:31:28 pgblade02 kernel: Node 0 DMA32 per-cpu:
> May 6 22:31:28 pgblade02 kernel: cpu 0 hot: high 186, batch 31 used:27
> May 6 22:31:29 pgblade02 kernel: cpu 0 cold: high 62, batch 15 used:54
> May 6 22:31:29 pgblade02 kernel: cpu 1 hot: high 186, batch 31 used:23
> May 6 22:31:29 pgblade02 kernel: cpu 1 cold: high 62, batch 15 used:49
> May 6 22:31:29 pgblade02 kernel: cpu 2 hot: high 186, batch 31 used:12
> May 6 22:31:29 pgblade02 kernel: cpu 2 cold: high 62, batch 15 used:14
> May 6 22:31:29 pgblade02 kernel: cpu 3 hot: high 186, batch 31 used:50
> May 6 22:31:29 pgblade02 kernel: cpu 3 cold: high 62, batch 15 used:60
> May 6 22:31:29 pgblade02 kernel: Node 0 Normal per-cpu:
> May 6 22:31:29 pgblade02 kernel: cpu 0 hot: high 186, batch 31 used:5
> May 6 22:31:29 pgblade02 kernel: cpu 0 cold: high 62, batch 15 used:48
> May 6 22:31:29 pgblade02 kernel: cpu 1 hot: high 186, batch 31 used:11
> May 6 22:31:29 pgblade02 kernel: cpu 1 cold: high 62, batch 15 used:39
> May 6 22:31:29 pgblade02 kernel: cpu 2 hot: high 186, batch 31 used:14
> May 6 22:31:29 pgblade02 kernel: cpu 2 cold: high 62, batch 15 used:57
> May 6 22:31:29 pgblade02 kernel: cpu 3 hot: high 186, batch 31 used:94
> May 6 22:31:29 pgblade02 kernel: cpu 3 cold: high 62, batch 15 used:36
> May 6 22:31:29 pgblade02 kernel: Node 0 HighMem per-cpu: empty
> May 6 22:31:29 pgblade02 kernel: Free pages: 41788kB (0kB HighMem)
> May 6 22:31:29 pgblade02 kernel: Active:974250 inactive:920579
> dirty:0 writeback:0 unstable:0 free:10447 slab:11470 mapped-file:985
> mapped-anon:1848625 pagetables:111027
> May 6 22:31:29 pgblade02 kernel: Node 0 DMA free:11172kB min:12kB
> low:12kB high:16kB active:0kB inactive:0kB present:10816kB
> pages_scanned:0 all_unreclaimable? yes
> May 6 22:31:29 pgblade02 kernel: lowmem_reserve[]: 0 3254 8052 8052
> May 6 22:31:29 pgblade02 kernel: Node 0 DMA32 free:23804kB min:4636kB
> low:5792kB high:6952kB active:1555260kB inactive:1566144kB
> present:3332668kB pages_scanned:35703257 all_unreclaimable? yes
> May 6 22:31:29 pgblade02 kernel: lowmem_reserve[]: 0 0 4797 4797
> May 6 22:31:29 pgblade02 kernel: Node 0 Normal free:6812kB min:6836kB
> low:8544kB high:10252kB active:2342332kB inactive:2115836kB
> present:4912640kB pages_scanned:10165709 all_unreclaimable? yes
> May 6 22:31:29 pgblade02 kernel: lowmem_reserve[]: 0 0 0 0
> May 6 22:31:29 pgblade02 kernel: Node 0 HighMem free:0kB min:128kB
> low:128kB high:128kB active:0kB inactive:0kB present:0kB
> pages_scanned:0 all_unreclaimable? no
> May 6 22:31:29 pgblade02 kernel: lowmem_reserve[]: 0 0 0 0
> May 6 22:31:29 pgblade02 kernel: Node 0 DMA: 3*4kB 5*8kB 3*16kB
> 6*32kB 4*64kB 3*128kB 0*256kB 0*512kB 2*1024kB 0*2048kB 2*4096kB =
> 11172kB
> May 6 22:31:29 pgblade02 kernel: Node 0 DMA32: 27*4kB 0*8kB 1*16kB
> 0*32kB 2*64kB 4*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 5*4096kB =
> 23804kB
> May 6 22:31:29 pgblade02 kernel: Node 0 Normal: 21*4kB 9*8kB 26*16kB
> 3*32kB 6*64kB 5*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 1*4096kB =
> 6812kB
> May 6 22:31:29 pgblade02 kernel: Node 0 HighMem: empty
> May 6 22:31:29 pgblade02 kernel: Swap cache: add 71286821, delete
> 71287152, find 207780333/216904318, race 1387+10506
> May 6 22:31:29 pgblade02 kernel: Free swap = 0kB
> May 6 22:31:30 pgblade02 kernel: Total swap = 8388600kB
> May 6 22:31:30 pgblade02 kernel: Free swap: 0kB
> May 6 22:31:30 pgblade02 kernel: 2293759 pages of RAM
> May 6 22:31:30 pgblade02 kernel: 249523 reserved pages
> May 6 22:31:30 pgblade02 kernel: 56111 pages shared
> May 6 22:31:30 pgblade02 kernel: 260 pages swap cached
> May 6 22:31:30 pgblade02 kernel: Out of memory: Killed process 29076
> (postgres).
>
>
> We get the following errors in the postgres log:
>
> A couple of times:
> 2010-05-06 22:26:28 CEST [23001]: [2-1] WARNING: worker took too long
> to start; cancelled
> Then:
> 2010-05-06 22:31:21 CEST [29059]: [27-1] LOG: system logger process
> (PID 29076) was terminated by signal 9: Killed
> Finally:
> 2010-05-06 22:50:20 CEST [29059]: [28-1] LOG: background writer
> process (PID 22999) was terminated by signal 9: Killed
> 2010-05-06 22:50:20 CEST [29059]: [29-1] LOG: terminating any other
> active server processes
>
> Any help highly appreciated,
>

Hello Silvio,

Is this machine dedicated to PostgreSQL?

If so, I'd recommend adding these two parameters to your sysctl.conf:

vm.overcommit_memory = 2
vm.overcommit_ratio = 0

With overcommit disabled that way, the OOM killer should no longer be invoked; an allocation that would push the system past its commit limit simply fails instead.
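
If it helps, here's roughly how I'd apply and sanity-check the change on a
running box (nothing PostgreSQL-specific, just a sketch):

# reload /etc/sysctl.conf without rebooting
sysctl -p

# confirm the new values took effect
cat /proc/sys/vm/overcommit_memory
cat /proc/sys/vm/overcommit_ratio

# in this mode CommitLimit = swap + RAM * overcommit_ratio / 100;
# Committed_AS is how much the kernel has already promised to processes
grep -i commit /proc/meminfo

One thing to keep an eye on: with vm.overcommit_ratio = 0 the commit limit
is just your swap space (about 8GB on this box, going by the
"Total swap = 8388600kB" line in your log), so if PostgreSQL plus
everything else legitimately needs more than that, you may want to raise
the ratio so allocations aren't refused too early.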

PostgreSQL should degrade gracefully if a malloc() fails because it asks for more memory than is actually available.
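
If a backend does bump into the commit limit, you should see an ordinary
out-of-memory error in the PostgreSQL log, something along these lines (the
request size here is just illustrative), rather than a signal 9 kill from
the kernel:

ERROR:  out of memory
DETAIL:  Failed on request of size 1048576.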

Hope that helps. =)

Regards,

Lacey

--
Lacey Powers

The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 ext 104
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
