Re: OOM killer and rocky linux

From: kaido vaikla <kaido(dot)vaikla(at)gmail(dot)com>
To: Matti Linnanvuori <matti(dot)linnanvuori(at)portalify(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: OOM killer and rocky linux
Date: 2023-11-09 15:02:40
Message-ID: CA+427g_0RV43B9ftVCrLZkpZr0++MrV+DHXaU58_Dqf+5y_MzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Yep, i know it. But my question was more like
why (during OOM)
pg on o/s Red Hat Enterprise Linux Server release 7.9 (Maipo)
does "Instance is reinitializing"
but
pg on o/s Rocky Linux release 9.2 (Blue Onyx)
got "Instance received fast shutdown request. "

br
Kaido

On Thu, 9 Nov 2023 at 15:04, Matti Linnanvuori <
matti(dot)linnanvuori(at)portalify(dot)com> wrote:

> [image: elephant.png]
>
> 19.4. Managing Kernel Resources
> <https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT>
> postgresql.org
> <https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT>
>
> <https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT>
>
> Hello!
>
> Disabling overcommit is recommended.
>
> Best regards
>
> kaido vaikla <kaido(dot)vaikla(at)gmail(dot)com> kirjoitti 9.11.2023 kello 14.57:
>
> Hi,
>
> The problem is probably somwhere between pg and linux.
> Seems like OOM killer on rocky linux behaves in a different way
> than on RHEL.
>
> For OOM killer invoking I used:
> set work_mem=50GB;
> select * from generate_series(1,100000000) order by random();
>
> pg runs as systemd service.
>
> Example 1:
> ==========
> pg 15.2
> o/s Red Hat Enterprise Linux Server release 7.9 (Maipo)
>
> pg log:
>
> 2023-07-05 12:56:28.848 EEST [:::119279:postmaster] LOG: server process
> (PID 120401) was terminated by signal 9: Killed
> 2023-07-05 12:56:28.848 EEST [:::119279:postmaster] LOG: terminating any
> other active server processes
> 2023-07-05 12:56:28.850 EEST [:::119279:postmaster] LOG: all server
> processes terminated; reinitializing
> 2023-07-05 12:56:28.885 EEST [:::120421:startup] LOG: database system was
> interrupted; last known up at 2023-07-05 12:55:34 EEST
> 2023-07-05 12:56:28.899 EEST [:::120421:startup] LOG: database system was
> not properly shut down; automatic recovery in progress
> 2023-07-05 12:56:28.904 EEST [:::120421:startup] LOG: redo starts at
> 1/D0002E8
> 2023-07-05 12:56:28.904 EEST [:::120421:startup] LOG: invalid record
> length at 1/D000320: wanted 24, got 0
> 2023-07-05 12:56:28.904 EEST [:::120421:startup] LOG: redo done at
> 1/D0002E8 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> 2023-07-05 12:56:28.912 EEST [:::120422:checkpointer] LOG: checkpoint
> starting: end-of-recovery immediate wait
> 2023-07-05 12:56:28.918 EEST [:::120422:checkpointer] LOG: checkpoint
> complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0
> recycled; write=0.002 s, sync=0.002 s, total=0.013 s; sync files=2,
> longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB
>
>
> Instance is reinitializing
>
> Example 2:
> ==========
> pg 15.2
> o/s Rocky Linux release 9.2 (Blue Onyx)
>
> pg log and systemd log in chronological order (juuni == 06 ):
>
> juuni 30 14:19:11.491833 pgdb-forecast kernel:
> oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/postgresql-15.service,task=postmaster,pid=2575533,uid=26
> juuni 30 14:19:11.491839 pgdb-forecast kernel: Out of memory: Killed
> process 2575533 (postmaster) total-vm:9716200kB, anon-rss:5113080kB,
> file-rss:0kB, shmem-rss:16828kB, UID:26 pgtables:10256kB oom_score_adj:0
> juuni 30 14:19:11.491846 pgdb-forecast kernel: oom_reaper: reaped process
> 2575533 (postmaster), now anon-rss:0kB, file-rss:0kB, shmem-rss:16828kB
> juuni 30 14:19:11.478410 pgdb-forecast systemd[1]: postgresql-15.service:
> A process of this unit has been killed by the OOM killer.
> 2023-06-30 14:19:11.505 EEST [:::2517202:postmaster] LOG: received fast
> shutdown request
> 2023-06-30 14:19:11.518 EEST [:::2517202:postmaster] LOG: aborting any
> active transactions
> 2023-06-30 14:19:11.564 EEST [:::2517202:postmaster] LOG: server process
> (PID 2575533) was terminated by signal 9: Killed
> 2023-06-30 14:19:11.564 EEST [:::2517202:postmaster] LOG: terminating any
> other active server processes
> 2023-06-30 14:19:11.633 EEST [:::2517202:postmaster] LOG: abnormal
> database system shutdown
> 2023-06-30 14:19:11.939 EEST [:::2517202:postmaster] LOG: database system
> is shut down
> juuni 30 14:19:11.940216 pgdb-forecast systemd[1]: postgresql-15.service:
> Main process exited, code=exited, status=1/FAILURE
> juuni 30 14:19:11.940255 pgdb-forecast systemd[1]: postgresql-15.service:
> Killing process 2517203 (postmaster) with signal SIGKILL.
> juuni 30 14:19:11.940700 pgdb-forecast systemd[1]: postgresql-15.service:
> Failed with result 'oom-kill'.
> juuni 30 14:19:11.940884 pgdb-forecast systemd[1]: postgresql-15.service:
> Consumed 46min 16.612s CPU time.
>
>
> Instance received fast shutdown request.
> And kernel sends kill twice: process 2575533, process 2517203.
>
> Googeling this problem gives me nothing, and i'm not sure,
> is it caused of pg or linux kernel.
>
> br
> Kaido
>
>
>
>
>
>
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message B Liu 2023-11-10 08:43:21 Re: pgagent-pg16 could not start: Error 1053
Previous Message Erik Serrano 2023-11-09 14:34:13 Re: pgagent-pg16 could not start: Error 1053