From: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY) |
Date: | 2014-05-17 18:41:37 |
Message-ID: | 5377AD61.70707@fuzzy.cz |
Lists: | pgsql-hackers |
On 17.5.2014 19:55, Tom Lane wrote:
> Tomas Vondra <tv(at)fuzzy(dot)cz> writes:
>> ... then of course the usual 'terminating connection because of crash of
>> another server process' warning. Apparently, it's getting killed by the
>> OOM killer, because it exhausts all the memory assigned to that VM (2GB).
>
> Can you fix things so it runs into its process ulimit before the OOM killer
> triggers? Then we'd get a memory map dumped to stderr, which would be
> helpful in localizing the problem.

I did this in /etc/security/limits.d/80-pgbuild.conf:

    pgbuild hard as 1835008

so the user the buildfarm runs under is limited to ~1.75GB of address
space (of the 2GB of RAM available to the container).

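As a sanity check on that number, here is a minimal sketch (not part of the buildfarm setup) that converts the limits.conf value to bytes and prints the address-space limit the current process actually sees. It assumes the 'as' item is expressed in KB, as documented for pam_limits; the helper names are hypothetical.

```python
import resource  # Unix-only standard library module

KIB = 1024


def limit_kib_to_bytes(kib: int) -> int:
    """Convert a limits.conf 'as' value (assumed to be in KiB) to bytes."""
    return kib * KIB


def describe_limit(kib: int) -> str:
    """Render the limit in KiB and GiB for a quick eyeball check."""
    gib = limit_kib_to_bytes(kib) / KIB**3
    return f"{kib} KiB = {gib:.2f} GiB"


def format_rlimit(value: int) -> str:
    """Format one RLIMIT_AS value, which may be 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


if __name__ == "__main__":
    # The value from 80-pgbuild.conf above.
    print(describe_limit(1835008))

    # What the current process actually got; run this as the pgbuild
    # user in a fresh login session so pam_limits has been applied.
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    print("soft:", format_rlimit(soft))
    print("hard:", format_rlimit(hard))
```

If the hard limit reported here is not 1879048192 bytes, the limits.d file was probably not picked up for that session.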
>
>> ... So this seems like a
>> memory leak somewhere in the cache invalidation code.
>
> Smells that way to me too, but let's get some more evidence.
The tests are already running, and there are a few postgres processes:

      PID   VIRT   RES  %CPU      TIME+  COMMAND
    11478   449m  240m 100.0  112:53.57  postgres: pgbuild regression [local] CREATE VIEW
    11423   219m   19m   0.0    0:00.17  postgres: checkpointer process
    11424   219m  2880   0.0    0:00.05  postgres: writer process
    11425   219m  5920   0.0    0:00.12  postgres: wal writer process
    11426   219m  2708   0.0    0:00.05  postgres: autovacuum launcher process
    11427  79544  1836   0.0    0:00.17  postgres: stats collector process
    11479  1198m  1.0g   0.0   91:09.99  postgres: pgbuild regression [local] CREATE INDEX waiting

Attached is 'pmap -x' output for the two interesting processes (11478,
11479).
Tomas
Attachment | Content-Type | Size |
---|---|---|
11478.pmap.log | text/x-log | 9.6 KB |
11479.pmap.log | text/x-log | 9.6 KB |