From: | pinker <pinker(at)onet(dot)eu> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: core system is getting unresponsive because over 300 cpu load |
Date: | 2017-10-11 00:26:26 |
Message-ID: | 1507681586390-0.post@n3.nabble.com |
Lists: | pgsql-general |
Tomas Vondra-4 wrote
> I'm probably a bit dumb (after all, it's 1AM over here), but can you
> explain the CPU chart? I'd understand percentages (say, 75% CPU used)
> but what do the seconds / fractions mean? E.g. when the system time
> reaches 5 seconds, what does that mean?
Hehe, no, you've just spotted a mistake: it's supposed to be 50 cores :)
out of 80 in total.
Tomas Vondra-4 wrote
> Have you tried profiling using perf? That usually identifies hot spots
> pretty quickly - either in PostgreSQL code or in the kernel.
I was always afraid of the overhead, but maybe it's time to start ...
Tomas Vondra-4 wrote
> What I meant is that if the system evicts this amount of buffers all the
> time (i.e. there doesn't seem to be any sudden spike), then it's
> unlikely to be the cause (or related to it).
I was actually thinking about a scenario where different sessions want to
read from or write to many different relfilenodes at the same time. Could that
cause pages to be swapped between shared buffers and the OS cache? We see that
context switches on the CPU are increasing as well. The kernel documentation
says that walking page tables instead of hitting the Translation Lookaside
Buffer (TLB) is very costly, and some blogs recommend using huge pages (so more
addresses can fit in the TLB) to help here. But PostgreSQL, unlike Oracle,
cannot use them for anything other than shared buffers (so 16 GB) ... so
process memory still needs to use 4 kB pages.
Or could it be memory fragmentation?
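
To put rough numbers on the TLB point, here is a minimal sketch of the page-count arithmetic for a 16 GB shared buffers region. The 2 MB huge page size is an assumption (a common x86-64 default, not stated in this thread), and the helper name `pages_needed` is purely illustrative.

```python
# Rough page-count arithmetic for the shared buffers region mentioned above.
GIB = 1024 ** 3
SHARED_BUFFERS_BYTES = 16 * GIB      # the "16gb" figure from the message
SMALL_PAGE_BYTES = 4 * 1024          # regular 4 kB pages
HUGE_PAGE_BYTES = 2 * 1024 * 1024    # ASSUMPTION: 2 MB huge pages


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of pages needed to map a region (ceiling division)."""
    return -(-region_bytes // page_bytes)


small_pages = pages_needed(SHARED_BUFFERS_BYTES, SMALL_PAGE_BYTES)
huge_pages = pages_needed(SHARED_BUFFERS_BYTES, HUGE_PAGE_BYTES)

print(f"4 kB pages needed: {small_pages}")
print(f"2 MB pages needed: {huge_pages}")
print(f"reduction factor:  {small_pages // huge_pages}")
```

Under these assumptions the number of mappings that compete for TLB entries drops by the ratio of the two page sizes. This covers only shared buffers; per-process memory would still be mapped with 4 kB pages, as noted above.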
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html