From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)
Date: 2014-05-17 20:33:31
Message-ID: 5377C79B.7040708@fuzzy.cz
Lists: pgsql-hackers

On 17.5.2014 21:31, Andres Freund wrote:
> On 2014-05-17 20:41:37 +0200, Tomas Vondra wrote:
>> On 17.5.2014 19:55, Tom Lane wrote:
>>> Tomas Vondra <tv(at)fuzzy(dot)cz> writes:
>> The tests are already running, and there are a few postgres processes:
>>
>> PID VIRT RES %CPU TIME+ COMMAND
>> 11478 449m 240m 100.0 112:53.57 postgres: pgbuild regression [local] CREATE VIEW
>> 11423 219m 19m 0.0 0:00.17 postgres: checkpointer process
>> 11424 219m 2880 0.0 0:00.05 postgres: writer process
>> 11425 219m 5920 0.0 0:00.12 postgres: wal writer process
>> 11426 219m 2708 0.0 0:00.05 postgres: autovacuum launcher process
>> 11427 79544 1836 0.0 0:00.17 postgres: stats collector process
>> 11479 1198m 1.0g 0.0 91:09.99 postgres: pgbuild regression [local] CREATE INDEX waiting
>>
>> Attached is 'pmap -x' output for the two interesting processes (11478,
>> 11479).
>
> Could you gdb -p 11479 into the process and issue 'p
> MemoryContextStats(TopMemoryContext)'. That should print information
> about the server's allocation to its stderr.

That process had already finished, but I've done the same for another process
(which had ~400MB allocated and was growing steadily, about 1MB every 10
seconds). At the time it was running a SELECT query, which has since completed.
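
For the record, what I ran was essentially the sequence you suggested, roughly
(with <pid> being the backend in question; the dump goes to that backend's
stderr):

  $ gdb -p <pid>
  (gdb) p MemoryContextStats(TopMemoryContext)
  (gdb) detach
  (gdb) quit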

Then it executed ANALYZE, and I took several snapshots of it - I'm not sure
how much memory it had at the beginning (maybe ~250MB), and then took further
snapshots at ~350MB, 400MB, 500MB and 600MB. It's still running, so I'm not
sure how much more it will grow.

Anyway, the main difference between the analyze snapshots seems to be this:

init: CacheMemoryContext: 67100672 total in 17 blocks; ...
350MB: CacheMemoryContext: 134209536 total in 25 blocks; ...
400MB: CacheMemoryContext: 192929792 total in 32 blocks; ...
500MB: CacheMemoryContext: 293593088 total in 44 blocks; ...
600MB: CacheMemoryContext: 411033600 total in 58 blocks; ...
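
The lines above are just the CacheMemoryContext totals pulled from the
attached snapshots, i.e. something like:

  grep 'CacheMemoryContext:' analyze*.log

The complete dumps are attached.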

I'm not sure whether there's something wrong with the SELECT memory context.
It has ~1500 nested nodes like these:

SQL function data: 24576 total in 2 blocks; ...
ExecutorState: 24576 total in 2 blocks; ...
SQL function data: 24576 total in 2 blocks; ...
ExprContext: 8192 total in 1 blocks; ...

But maybe it's expected / OK.
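
FWIW, the ~1500 is just a rough count of those entries in the attached select
log, something like:

  zcat select.log.gz | grep -c 'SQL function data'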

regards
Tomas

Attachment Content-Type Size
select.log.gz application/gzip 83.1 KB
analyze.log text/x-log 3.0 KB
analyze-350MB.log text/x-log 2.9 KB
analyze-400MB.log text/x-log 2.9 KB
analyze-500MB.log text/x-log 2.9 KB
analyze-600MB.log text/x-log 2.9 KB
