Re: ??: postgres cpu 100% need help

From: "657985552(at)qq(dot)com" <657985552(at)qq(dot)com>
To: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
Cc: "Bill Moran" <wmoran(at)potentialtech(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: ??: postgres cpu 100% need help
Date: 2015-11-23 06:20:58
Message-ID: 2015112314205736972210@qq.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

hi Merlin:
I'm sorry to reply so late. I don't know exactly the resault :
> number of active queries with average duration:
> SELECT query, count(*), avg(now() - query_start) FROM
> pg_stat_activity where state != 'idle' GROUP BY 1;
there are avg 100 active connect during load even.every query cost 2000 ms+.

how can I Capture backtrace during load even?

[root(at)pg1 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
[root(at)pg1 ~]# cat /sys/kernel/mm/transparent_hugepage/defrag
[always] madvise never

657985552(at)qq(dot)com

From: Merlin Moncure
Date: 2015-11-10 23:55
To: 莎士比亚说:
CC: Bill Moran; pgsql-general
Subject: Re: [GENERAL] ??: postgres cpu 100% need help
On Tue, Nov 10, 2015 at 8:26 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Sun, Nov 8, 2015 at 7:22 AM, 莎士比亚说: <657985552(at)qq(dot)com> wrote:
>> Hi moran and others;
>>
>> yesterday i get the pg problem again, and i use perf top Observation
>> follows:
>> PerfTop: 11574 irqs/sec kernel: 2.2% exact: 0.0% [4000Hz cycles],
>> (all, 32 CPUs)
>> 81.39% postgres [.] s_lock
>> 5.42% postgres [.] LWLockAcquire
>> 4.59% postgres [.] LWLockRelease
>> 3.06% postgres [.] TransactionIdIsInProgress
>> 0.38% postgres [.] PinBuffer
>> 0.31% postgres [.] TransactionIdPrecedes
>> 0.27% postgres [.] UnpinBuffer
>> 0.19% postgres [.] TransactionIdIsCurrentTransactionId
>> 0.16% postgres [.] heap_hot_search_buffer
>> 0.15% [kernel] [k] number.isra.1
>> 0.14% [kernel] [k] kallsyms_expand_symbol.constprop.1
>> 0.10% [kernel] [k] module_get_kallsym
>> 0.10% libc-2.17.so [.] __strcmp_sse42
>> 0.09% [kernel] [k] _raw_spin_lock
>> 0.09% postgres [.] hash_search_with_hash_value
>>
>> is spin lock problem ? I need everyone's help to solve the problem.thsnks!
>
> Yup, spinlock problem. These can be difficult. What weneed now is
> some metrics. They must be captured during load event:
>
> *) number of active queries with average duration:
> SELECT query, count(*), avg(now() - query_start) FROM
> pg_stat_activity where state != 'idle' GROUP BY 1;
>
> *) context switches via "vmstat 1" get a snapshot during load and
> during non load time for comparison
>
> Solution to this problem will probably be one or more of:
> 1) Significantly downgrade shared_buffers (say, to 4GB)
>
> 2) Upgrade database to 9.4 and hope for the best
>
> 3) Capture backtrace during load event to determine exactly which path
> is going into spinlock
>
> 4) Install pgbouncer or another connection pooler to limit active
> queries on database
>
> 5) Install experimental patches to custom compiled database to test
> and verify a hypothetical fix
>
> Out of those 5 things, which are possible for you to do? Best case
> scenario is that you have non-essential server that reproduces the
> issue.

Can we also see output of:

cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag

merlin

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Albe Laurenz 2015-11-23 08:46:54 Re: [ADMIN] How to drop stats on table
Previous Message Adrian Klaver 2015-11-22 16:20:37 Re: After configuring remote access,server can't be started