From: | "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: DB running out of memory issues after upgrade |
Date: | 2020-02-23 10:19:28 |
Message-ID: | 20200223101928.GE12464@hjp.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgadmin-hackers pgadmin-support pgsql-admin pgsql-bugs pgsql-general pgsql-hackers pgsql-performance pgsql-pkg-yum |
On 2020-02-18 18:10:08 +0000, Nagaraj Raj wrote:
> Below are the same configurations ins .conf file before and after updagrade
>
> show max_connections; = 1743
[...]
> show work_mem = "4MB"
This is an interesting combination: So you expect a large number of
connections but each one should use very little RAM?
[...]
> here is some sys logs,
>
> 2020-02-16 21:01:17 UTC [-]The database process was killed by the OS
> due to excessive memory consumption.
> 2020-02-16 13:41:16 UTC [-]The database process was killed by the OS
> due to excessive memory consumption.
The oom-killer produces a huge block of messages which you can find with
dmesg or in your syslog. It looks something like this:
Feb 19 19:06:53 akran kernel: [3026711.344817] platzangst invoked oom-killer: gfp_mask=0x15080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), nodemask=(null), order=1, oom_score_adj=0
Feb 19 19:06:53 akran kernel: [3026711.344819] platzangst cpuset=/ mems_allowed=0-1
Feb 19 19:06:53 akran kernel: [3026711.344825] CPU: 7 PID: 2012 Comm: platzangst Tainted: G OE 4.15.0-74-generic #84-Ubuntu
Feb 19 19:06:53 akran kernel: [3026711.344826] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.7 06/16/2016
Feb 19 19:06:53 akran kernel: [3026711.344827] Call Trace:
Feb 19 19:06:53 akran kernel: [3026711.344835] dump_stack+0x6d/0x8e
Feb 19 19:06:53 akran kernel: [3026711.344839] dump_header+0x71/0x285
...
Feb 19 19:06:53 akran kernel: [3026711.344893] RIP: 0033:0x7f292d076b1c
Feb 19 19:06:53 akran kernel: [3026711.344894] RSP: 002b:00007fff187ef240 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
Feb 19 19:06:53 akran kernel: [3026711.344895] RAX: ffffffffffffffda RBX: 00007fff187ef240 RCX: 00007f292d076b1c
Feb 19 19:06:53 akran kernel: [3026711.344896] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
Feb 19 19:06:53 akran kernel: [3026711.344897] RBP: 00007fff187ef2b0 R08: 00007f292d596740 R09: 00000000009d43a0
Feb 19 19:06:53 akran kernel: [3026711.344897] R10: 00007f292d596a10 R11: 0000000000000246 R12: 0000000000000000
Feb 19 19:06:53 akran kernel: [3026711.344898] R13: 0000000000000020 R14: 0000000000000000 R15: 0000000000000000
Feb 19 19:06:53 akran kernel: [3026711.344899] Mem-Info:
Feb 19 19:06:53 akran kernel: [3026711.344905] active_anon:14862589 inactive_anon:1133875 isolated_anon:0
Feb 19 19:06:53 akran kernel: [3026711.344905] active_file:467 inactive_file:371 isolated_file:0
Feb 19 19:06:53 akran kernel: [3026711.344905] unevictable:0 dirty:3 writeback:0 unstable:0
...
Feb 19 19:06:53 akran kernel: [3026711.344985] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Feb 19 19:06:53 akran kernel: [3026711.344997] [ 823] 0 823 44909 0 106496 121 0 lvmetad
Feb 19 19:06:53 akran kernel: [3026711.344999] [ 1354] 0 1354 11901 3 135168 112 0 rpcbind
Feb 19 19:06:53 akran kernel: [3026711.345000] [ 1485] 0 1485 69911 99 180224 159 0 accounts-daemon
...
Feb 19 19:06:53 akran kernel: [3026711.345345] Out of memory: Kill process 25591 (postgres) score 697 or sacrifice child
Feb 19 19:06:53 akran kernel: [3026711.346563] Killed process 25591 (postgres) total-vm:71116948kB, anon-rss:52727552kB, file-rss:0kB, shmem-rss:3023196kB
The most interesting lines are usually the last two: In this case they
tell us that the process killed was a postgres process and it occupied
about 71 GB of virtual memory at that time. That was clearly the right
choice since the machine has only 64 GB of RAM. Sometimes it is less
clear and then you might want to scroll through the (usually long) list
of processes to see if there are other processes which need suspicious
amounts of RAM or maybe if there are just more of them than you would
expect.
> I identified one simple select which consuming more memory and here is the
> query plan,
>
>
>
> "Result (cost=0.00..94891854.11 rows=3160784900 width=288)"
> " -> Append (cost=0.00..47480080.61 rows=3160784900 width=288)"
> " -> Seq Scan on msghist (cost=0.00..15682777.12 rows=3129490000 width
> =288)"
> " Filter: (((data -> 'info'::text) ->> 'status'::text) =
> 'CLOSE'::text)"
> " -> Seq Scan on msghist msghist_1 (cost=0.00..189454.50 rows=31294900
> width=288)"
> " Filter: (((data -> 'info'::text) ->> 'status'::text) =
> 'CLOSE'::text)"
So: How much memory does that use? It produces a huge number of rows
(more than 3 billion) but it doesn't do much with them, so I wouldn't
expect the postgres process itself to use much memory. Are you sure its
the postgres process and not the application which uses a lot of memory?
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp(at)hjp(dot)at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
From | Date | Subject | |
---|---|---|---|
Next Message | Akshay Joshi | 2020-02-24 06:41:53 | pgAdmin 4 commit: Fix an issue where command and statements were parsed |
Previous Message | Aditya Toshniwal | 2020-02-21 09:39:25 | [pgAdmin][RM5115] RULE SQL incorrectly generated |
From | Date | Subject | |
---|---|---|---|
Next Message | Dave Page | 2020-02-24 09:23:43 | Re: Privacy question regarding the "Geometry Viewer" feature in pgAdmin 4. |
Previous Message | tutiluren | 2020-02-23 06:02:40 | Privacy question regarding the "Geometry Viewer" feature in pgAdmin 4. |
From | Date | Subject | |
---|---|---|---|
Next Message | Adrian Klaver | 2020-02-23 18:19:34 | Re: Reset DB stats suggestion pg_stat_reset() |
Previous Message | postggen2020 s | 2020-02-23 02:40:33 | Re: Reset DB stats suggestion pg_stat_reset() |
From | Date | Subject | |
---|---|---|---|
Next Message | Fan Liu | 2020-02-24 01:32:26 | RE: [Bus error] huge_pages default value (try) not fall back |
Previous Message | Tom Lane | 2020-02-23 06:43:26 | Re: BUG #16273: InitDB Memory leak |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter J. Holzer | 2020-02-23 11:21:48 | Re: Can I trigger an action from a coalesce ? |
Previous Message | Andrus | 2020-02-23 09:06:09 | Re: How to get error message details from libpq based psqlODBC driver (regression) |
From | Date | Subject | |
---|---|---|---|
Next Message | Dave Cramer | 2020-02-23 11:16:23 | Re: Error on failed COMMIT |
Previous Message | Michael Paquier | 2020-02-23 07:27:57 | Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions? |
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2020-02-23 10:46:09 | Re: PostgreSQL 11 higher Planning time on Partitioned table |
Previous Message | Justin Pryzby | 2020-02-23 10:12:09 | Re: PostgreSQL 11 higher Planning time on Partitioned table |
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2020-02-24 16:34:03 | Re: DB running out of memory issues after upgrade |
Previous Message | Nagaraj Raj | 2020-02-18 19:10:20 | Re: DB running out of memory issues after upgrade |