Failing to allocate memory when I think it shouldn't

From: Thomas Ziegler <thomas(dot)ziegler(at)holmsecurity(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Failing to allocate memory when I think it shouldn't
Date: 2024-09-13 05:22:47
Message-ID: PAWPR07MB1009548159C0046E01170C6E6ED652@PAWPR07MB10095.eurprd07.prod.outlook.com
Lists: pgsql-general

Hello Community!

I have had my database killed by the kernel oom-killer. After that I turned off memory over-committing, and that is where things got weird.

I have PostgreSQL 16.2 running on Debian Bookworm. The machine has 64GB of RAM and runs in a VM in an OpenStack environment.
I have `shared_buffers` at `16000MB`, `work_mem` at `80MB`, `temp_buffers` at `8MB`, `max_connections` at `300` and `maintenance_work_mem` at `1GB`.
So all in all, I get to roughly 42GB of maximum memory usage (`16000 + (80 + 8) * 300 = 42400` MB).
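
(As a rough sanity check, the same worst-case figure can be recomputed from the live settings with something like the query below; it deliberately ignores `maintenance_work_mem`, autovacuum workers and per-backend overhead, so it is only a sketch, not an exact bound.)

  SELECT pg_size_pretty(
           pg_size_bytes(current_setting('shared_buffers'))
         + ( pg_size_bytes(current_setting('work_mem'))
           + pg_size_bytes(current_setting('temp_buffers')) )
           * current_setting('max_connections')::bigint
         ) AS rough_worst_case;  -- same ~42400MB ballpark as the manual sum above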

But now I have connections failing because they can't allocate enough memory. I thought queries would start creating temporary files in that case (and I tested this with trivial examples in psql), but the connections from my applications still sometimes fail. I suspect that the queries are so large that the execution plans etc. are too large to fit into the 80MB of `work_mem`. This is the "simpler" of my two problems, but it would be nice if somebody could tell me whether that suspicion makes sense.
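
The kind of trivial test I mean was along these lines (the exact numbers don't matter; the point is just to force a sort that can't fit in `work_mem`):

  SET work_mem = '4MB';
  EXPLAIN (ANALYZE, BUFFERS)
    SELECT g FROM generate_series(1, 5000000) AS g ORDER BY g;
  -- the sort node reports "Sort Method: external merge  Disk: ..." instead
  -- of failing, i.e. the sort spills to temporary files as expected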

The harder issue is that in multiple instances those memory issues actually led to the database "crashing" and having to recover from the redo log. To be precise, the master process (the postmaster) doesn't crash, and the database automatically starts recovery (in contrast to the oom-kill, where I had to restart the database manually).
I can't get to the bottom of why that happens. At the point in time when it happened, there was more than enough usable memory on the system (>17GB), and I don't understand how a single worker process failing to get more RAM can take down the whole database. Also, why wasn't it able to get more RAM in the first place? The kernel should have been more than happy to give it more, as there was so much usable RAM available. (By "usable" I mean free + cache - shared.)

Here are the logs from one of those instances:

TopMemoryContext: 166384 total in 7 blocks; 33128 free (10 chunks); 133256 used
smgr relation table: 16384 total in 2 blocks; 4640 free (2 chunks); 11744 used
PgStat Shared Ref Hash: 7216 total in 2 blocks; 688 free (0 chunks); 6528 used
PgStat Shared Ref: 1024 total in 1 blocks; 328 free (0 chunks); 696 used
PgStat Pending: 4096 total in 3 blocks; 2168 free (6 chunks); 1928 used
TopTransactionContext: 8192 total in 1 blocks; 4536 free (9 chunks); 3656 used
TransactionAbortContext: 32768 total in 1 blocks; 32504 free (0 chunks); 264 used
Portal hash: 8192 total in 1 blocks; 592 free (0 chunks); 7600 used
TopPortalContext: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Relcache by OID: 16384 total in 2 blocks; 7712 free (2 chunks); 8672 used
CacheMemoryContext: 262144 total in 6 blocks; 160 free (2 chunks); 261984 used
index info: 2048 total in 2 blocks; 760 free (2 chunks); 1288 used: pg_rewrite_rel_rulename_index
[ ... ]
WAL record construction: 50200 total in 2 blocks; 6376 free (0 chunks); 43824 used
PrivateRefCount: 8192 total in 1 blocks; 2648 free (0 chunks); 5544 used
MdSmgr: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
LOCALLOCK hash: 8192 total in 1 blocks; 592 free (0 chunks); 7600 used
GUCMemoryContext: 24576 total in 2 blocks; 5288 free (5 chunks); 19288 used
GUC hash table: 32768 total in 3 blocks; 12704 free (5 chunks); 20064 used
Timezones: 104112 total in 2 blocks; 2648 free (0 chunks); 101464 used
Postmaster: 8192 total in 1 blocks; 6496 free (10 chunks); 1696 used
ident parser context: 1024 total in 1 blocks; 760 free (0 chunks); 264 used
hba parser context: 17408 total in 5 blocks; 6944 free (7 chunks); 10464 used
ErrorContext: 8192 total in 1 blocks; 7928 free (6 chunks); 264 used
Grand total: 889064 bytes in 124 blocks; 180840 free (113 chunks); 708224 used
2024-09-12 05:18:36.064 UTC [3808080] _(at)_ FATAL: out of memory
2024-09-12 05:18:36.064 UTC [3808080] _(at)_ DETAIL: Failed on request of size 190 in memory context "CacheMemoryContext".
TopMemoryContext: 166384 total in 7 blocks; 33128 free (10 chunks); 133256 used
[ ... ]
2024-09-12 05:18:36.065 UTC [3808075] _(at)_ FATAL: out of memory
2024-09-12 05:18:36.065 UTC [3808075] _(at)_ DETAIL: Failed on request of size 190 in memory context "CacheMemoryContext".
2024-09-12 05:18:36.065 UTC [3808076] ERROR: out of memory
2024-09-12 05:18:36.065 UTC [3808076] DETAIL: Failed on request of size 32800 in memory context "HashBatchContext".
2024-09-12 05:18:36.065 UTC [3805724] _(at)_ ERROR: out of memory
2024-09-12 05:18:36.065 UTC [3805724] _(at)_ DETAIL: Failed on request of size 32800 in memory context "HashBatchContext".
2024-09-12 05:18:36.065 UTC [3805724] _(at)_ CONTEXT: parallel worker
2024-09-12 05:18:36.065 UTC [3808078] _(at)_ FATAL: out of memory
2024-09-12 05:18:36.065 UTC [3808078] _(at)_ DETAIL: Failed on request of size 190 in memory context "CacheMemoryContext".
2024-09-12 05:18:36.065 UTC [3808081] _(at)_ FATAL: out of memory
2024-09-12 05:18:36.065 UTC [3808081] _(at)_ DETAIL: Failed on request of size 40992 in memory context "TopMemoryContext".
2024-09-12 05:18:36.066 UTC [1932776] LOG: could not fork new process for connection: Cannot allocate memory
2024-09-12 05:18:36.066 UTC [3808050] _(at)_ ERROR: out of memory
2024-09-12 05:18:36.066 UTC [3808050] _(at)_ DETAIL: Failed while creating memory context "HashAgg meta context".
2024-09-12 05:18:36.067 UTC [3488999] _(at)_ ERROR: out of memory
2024-09-12 05:18:36.067 UTC [3488999] _(at)_ DETAIL: Failed on request of size 32800 in memory context "HashBatchContext".
2024-09-12 05:18:36.073 UTC [1932776] LOG: background worker "parallel worker" (PID 3808076) exited with exit code 1
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
2024-09-12 05:18:36.083 UTC [1932776] LOG: background worker "parallel worker" (PID 3808077) was terminated by signal 6: Aborted
2024-09-12 05:18:36.083 UTC [1932776] LOG: terminating any other active server processes
2024-09-12 05:18:36.829 UTC [1932776] LOG: all server processes terminated; reinitializing
2024-09-12 05:18:39.731 UTC [3808861] LOG: database system was interrupted; last known up at 2024-09-12 05:08:52 UTC
2024-09-12 05:18:40.277 UTC [3808861] LOG: database system was not properly shut down; automatic recovery in progress
2024-09-12 05:18:40.303 UTC [3808861] LOG: redo starts at 3EA1/3AB0CF70
2024-09-12 05:18:50.304 UTC [3808861] LOG: redo in progress, elapsed time: 10.00 s, current LSN: 3EA1/837DA848
2024-09-12 05:18:55.612 UTC [3808861] LOG: invalid record length at 3EA1/C42A4A48: expected at least 24, got 0
2024-09-12 05:18:55.612 UTC [3808861] LOG: redo done at 3EA1/C42A4A20 system usage: CPU: user: 9.04 s, system: 3.49 s, elapsed: 15.31 s
2024-09-12 05:18:55.642 UTC [3808862] LOG: checkpoint starting: end-of-recovery immediate wait
2024-09-12 05:19:16.215 UTC [3808862] LOG: checkpoint complete: wrote 294597 buffers (14.4%); 0 WAL file(s) added, 0 removed, 105 recycled; [ ... ]
2024-09-12 05:19:16.340 UTC [1932776] LOG: database system is ready to accept connections

I would be eternally grateful if somebody could explain to me what is going on there. As I wrote earlier, there should have been more than enough RAM available on the system, so I don't see any reason why the kernel would deny any memory allocations.
The only related item I was able to find is this thread from 2018: https://www.postgresql.org/message-id/flat/e47eec7f-9dbc-cbe3-dd01-32cbee90f332%40neteven.com

Cheers,
Thomas
