Re: 12.2: Howto check memory-leak in worker?

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Peter <pmc(at)citylink(dot)dinoex(dot)sub(dot)org>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: 12.2: Howto check memory-leak in worker?
Date: 2020-05-04 16:14:29
Message-ID: CAFiTN-vfcqi_VD+0ZbPNBaU2+SsdXMJx6rcePZg5m--wpB=ABA@mail.gmail.com
Lists: pgsql-general

On Mon, May 4, 2020 at 5:43 PM Peter <pmc(at)citylink(dot)dinoex(dot)sub(dot)org> wrote:
>
> Hi all,
> I have something that looks a bit insane:
>
> # ps axl | grep 6145
> UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
> 770 6145 1 0 20 0 241756 868 select SsJ - 0:24.62 /usr/local/bin/postgres -D
> 770 6147 6145 0 23 0 243804 109784 select IsJ - 3:18.52 postgres: checkpointer (
> 770 6148 6145 0 20 0 241756 21348 select SsJ - 2:02.83 postgres: background writer
> 770 6149 6145 0 20 0 241756 7240 select SsJ - 16:36.80 postgres: walwriter (pos
> 770 6150 6145 0 20 0 21980 876 select SsJ - 0:13.92 postgres: archiver last w
> 770 6151 6145 0 20 0 21980 980 select SsJ - 0:58.45 postgres: stats collector
> 770 6152 6145 0 20 0 241756 1268 select IsJ - 0:02.07 postgres: logical replicati
> 770 43315 6145 0 21 0 251844 7520 select IsJ - 1:07.74 postgres: admin postgres 19
> 770 43317 6145 0 25 0 251764 8684 select IsJ - 1:28.89 postgres: admin bareos 192.
> 770 43596 6145 0 20 0 245620 4476 select IsJ - 0:00.12 postgres: admin bareos 192.
> 770 43761 6145 0 20 0 245620 4476 select IsJ - 0:00.15 postgres: admin bareos 192.
> 770 90206 6145 0 52 0 1331256 219720 racct DsJ - 563:45.41 postgres: bareos bareos 192
>
> The 90206 is continuously growing. It is the unspecific, all-purpose
> worker for the www.bareos.com backup tool, so it is a bit difficult to
> figure out what precisely it does - but it tends to be rather simple
> straight-forward queries, so it is unlikely to have dozens of "internal sort
> operations and hash tables".
>
> What I can say is that at times this worker is completely idle in
> ClientRead, but does not shrink in memory. Is this normal behaviour?
>
> Here is a more dynamic picture: it continues to add 2048kB chunks (and
> does not do noticeable paging):
>
> UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
> Mon May 4 13:33:09 CEST 2020
> 770 90206 6145 0 91 0 1335352 226900 - RsJ - 569:09.19 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:33:39 CEST 2020
> 770 90206 6145 0 93 0 1335352 227696 - RsJ - 569:28.48 postgres: bareos bareos idle (postgres)
> Mon May 4 13:34:09 CEST 2020
> 770 90206 6145 0 92 0 1337400 228116 - RsJ - 569:47.46 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:34:39 CEST 2020
> 770 90206 6145 0 92 0 1337400 228596 - RsJ - 570:06.56 postgres: bareos bareos UPDATE (postgres)
> Mon May 4 13:35:09 CEST 2020
> 770 90206 6145 0 92 0 1337400 228944 - RsJ - 570:25.62 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:35:40 CEST 2020
> 770 90206 6145 0 52 0 1337400 229288 racct DsJ - 570:44.33 postgres: bareos bareos UPDATE (postgres)
> Mon May 4 13:36:10 CEST 2020
> 770 90206 6145 0 91 0 1337400 229952 - RsJ - 571:03.20 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:36:40 CEST 2020
> 770 90206 6145 0 52 0 1337400 223772 racct DsJ - 571:21.50 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:37:10 CEST 2020
> 770 90206 6145 0 91 0 1337400 224448 - RsJ - 571:40.63 postgres: bareos bareos idle (postgres)
> Mon May 4 13:37:40 CEST 2020
> 770 90206 6145 0 91 0 1339448 225464 - RsJ - 571:58.36 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:38:10 CEST 2020
> 770 90206 6145 0 52 0 1339448 215620 select SsJ - 572:14.24 postgres: bareos bareos idle (postgres)
> Mon May 4 13:38:40 CEST 2020
> 770 90206 6145 0 81 0 1339448 215320 - RsJ - 572:21.09 postgres: bareos bareos idle (postgres)
> Mon May 4 13:39:10 CEST 2020
>
>
> OS is FreeBSD 11.3-RELEASE-p8 r360175M i386
> PostgreSQL 12.2 on i386-portbld-freebsd11.3, compiled by gcc9 (FreeBSD Ports Collection) 9.3.0, 32-bit
>
> autovacuum is Disabled.
>
> The memory-specific config is:
> > shared_buffers = 200MB
> > temp_buffers = 40MB
> > work_mem = 80MB
> > maintenance_work_mem = 250MB
> > dynamic_shared_memory_type = posix
> > random_page_cost = 2.0
> > effective_cache_size = 1GB
> (others are left at default)
>
> I remember vaguely that there are means to have a closer look into
> what is using the memory, but do not recall the specifics. Some
> pointers or ideas to proceed would be gladly appreciated (Dtrace
> should work) - processes will usually fail with OOM at this size, due
> to machine configuration - I'm waiting for that now (it is a very very
> old pentium3 machine ;) ).

One idea is to attach gdb to the process and call
MemoryContextStats(TopMemoryContext). This shows how much memory each
context is using. If you call the function 2-3 times at some interval,
you can see in which context the memory is continuously increasing.
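
To compare the dumps, something like the sketch below might help. After
attaching with "gdb -p 90206" and issuing
"call MemoryContextStats(TopMemoryContext)", the statistics are written to
the backend's stderr, so they normally end up in the server log. The script
takes two such dumps and lists the contexts whose total size grew. The line
format in the regex is what I would expect from a 12.x server and is an
assumption here; the helper names (parse_stats, growth) and the sample
numbers are made up for illustration and are not taken from the real
process.

```python
import re
from typing import Dict, List, Tuple

# Assumed shape of one context line in a MemoryContextStats dump:
#   "  CacheMemoryContext: 524288 total in 7 blocks; 1234 free (3 chunks); 523054 used"
# The "Grand total" line and the "N more child contexts" summaries do not
# match this pattern and are therefore skipped.
LINE_RE = re.compile(
    r"^\s*(?P<name>[^:]+): (?P<total>\d+) total in \d+ blocks; "
    r"\d+ free \(\d+ chunks\); \d+ used"
)


def parse_stats(text: str) -> Dict[str, int]:
    """Sum the 'total' bytes per context name for one dump."""
    totals: Dict[str, int] = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue
        name = match.group("name")
        # Several contexts can share a name (e.g. one per index), so add up.
        totals[name] = totals.get(name, 0) + int(match.group("total"))
    return totals


def growth(before: Dict[str, int], after: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (context name, bytes grown) pairs, largest growth first."""
    grown = [
        (name, size - before.get(name, 0))
        for name, size in after.items()
        if size > before.get(name, 0)
    ]
    return sorted(grown, key=lambda item: item[1], reverse=True)


# Invented sample dumps, only to show the mechanics.
SNAP1 = """TopMemoryContext: 68720 total in 5 blocks; 16312 free (12 chunks); 52408 used
  CacheMemoryContext: 524288 total in 7 blocks; 100000 free (3 chunks); 424288 used
  MessageContext: 8192 total in 1 blocks; 6000 free (0 chunks); 2192 used
Grand total: 601200 bytes in 13 blocks; 122312 free (15 chunks); 478888 used
"""

SNAP2 = """TopMemoryContext: 68720 total in 5 blocks; 16312 free (12 chunks); 52408 used
  CacheMemoryContext: 2621440 total in 8 blocks; 100000 free (3 chunks); 2521440 used
  MessageContext: 8192 total in 1 blocks; 6000 free (0 chunks); 2192 used
Grand total: 2698352 bytes in 14 blocks; 122312 free (15 chunks); 2576040 used
"""

if __name__ == "__main__":
    for name, delta in growth(parse_stats(SNAP1), parse_stats(SNAP2)):
        print(f"{name}: +{delta} bytes")
```

With real data, paste two sections of the server log into SNAP1 and SNAP2
(or read them from files); a context that shows up at the top in every
comparison is the one to look at more closely.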

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
