Re: High SYS CPU - need advise

From: Vlad <marchenko(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: High SYS CPU - need advise
Date: 2012-11-16 17:19:05
Message-ID: CAKeSUqVuAOTs7vzEkdjDui-=BQPdPv9SD3NigJ4q+BD4C4m_DA@mail.gmail.com
Lists: pgsql-general

> We're looking for spikes in 'blk' which represents when lwlocks bump.
> If you're not seeing any then this is suggesting a buffer pin related
> issue -- this is also supported by the fact that raising shared
> buffers didn't help. If you're not seeing 'blk's, go ahead and
> disable the stats macro.
>

Most blk values are 0, some are 1, and a few reach 100. I can't say that
the ratio of zero to non-zero blk values is noticeably different during
stall times.
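
A minimal sketch of how that comparison could be made less impressionistic, by bucketing blk values from captured stats output for a stall window and for a quiet window. The line shape in the comment is an assumption about what the lwlock stats macro prints, and the function names and bucket boundaries are hypothetical:

```python
import re
from collections import Counter

# Assumed line shape (not confirmed in this thread):
#   PID 12345 lwlock 7: shacq 10 exacq 2 blk 1
BLK_RE = re.compile(r"\bblk (\d+)")


def blk_histogram(lines):
    """Count stats lines per blk bucket: '0', '1-9', '10-99', '100+'."""
    buckets = Counter()
    for line in lines:
        match = BLK_RE.search(line)
        if match is None:
            continue  # not a stats line, skip it
        blk = int(match.group(1))
        if blk == 0:
            buckets["0"] += 1
        elif blk < 10:
            buckets["1-9"] += 1
        elif blk < 100:
            buckets["10-99"] += 1
        else:
            buckets["100+"] += 1
    return buckets


def nonzero_share(buckets):
    """Fraction of stats lines whose blk value is non-zero."""
    total = sum(buckets.values())
    if total == 0:
        return 0.0
    return 1 - buckets["0"] / total
```

Running blk_histogram over stderr captured during a stall and again over a quiet period, then comparing nonzero_share for the two, would show whether the distributions really are similar.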

> So, what we need to know now is:
> *) What happens when you drastically *lower* shared buffers? Say, to
> 64mb? Note, you may experience higher load for unrelated reasons and
> have to scuttle the test. Also, if you have to crank higher to handle
> internal server structures, do that. This is a hail mary, but maybe
> something interesting spits out.
>

Lowering shared_buffers didn't help.

> *) How many specific query plans are needed to introduce the
> condition? Hopefully, it's not too many. If so, let's start
> gathering the plans. If you have a lot of plans to sift through, one
> thing we can attempt to eliminate noise is to tweak
> log_min_duration_statement so that during stall times (only) it logs
> offending queries that are unexpectedly blocking.
>

Unfortunately, there are quite a few query plans. Also, I don't think
setting log_min_duration_statement will help us, because when the server is
under a high load average it reacts slowly even to a key press, so even
non-offending queries take a long time to execute. During a stall I see all
sorts of queries running slowly, ranging from simple ones such as
LOG: duration: 1131.041 ms statement: SELECT 'DBD::Pg ping test'
to complex ones joining multiple tables.
We are still going through all the logged queries in an attempt to find the
ones that are causing the problem. I'll report back if we find any clues.
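
One way to sift the logged queries, sketched under assumptions: since even the trivial ping statement takes over a second during a stall, slow pings could serve as a marker for stall windows, so that other slow statements outside those windows stand out. The regular expression assumes log lines shaped like the one quoted above; the function names and the 100 ms threshold are hypothetical:

```python
import re

# Matches lines such as:
#   LOG: duration: 1131.041 ms statement: SELECT 'DBD::Pg ping test'
DURATION_RE = re.compile(r"duration: (\d+(?:\.\d+)?) ms\s+statement: (.*)")


def parse_durations(lines):
    """Return (duration_ms, statement) pairs for lines that carry a duration."""
    entries = []
    for line in lines:
        match = DURATION_RE.search(line)
        if match:
            entries.append((float(match.group(1)), match.group(2).strip()))
    return entries


def slow_pings(entries, threshold_ms=100.0, marker="DBD::Pg ping test"):
    """Durations of ping statements at or above the (hypothetical) threshold."""
    return [ms for ms, stmt in entries if marker in stmt and ms >= threshold_ms]
```

With timestamps added to the log line prefix, the slow pings would give approximate stall intervals to exclude when looking for genuinely expensive statements.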

>
> *) Approximately how big is your 'working set' -- the data your
> queries are routinely hitting?
>

I *think* it's within the range of a few hundred MB.

>
> *) Is the distribution of the *types* of queries uniform? Or do you
> have special processes that occur on intervals?
>

It's pretty uniform.

>
> Thanks for your patience.
>
>
Oh no, thank you for trying to help me resolve this issue.

-- vlad
