Re: Postgres Replaying WAL slowly

From: Jeff Frost <jeff(at)pgexperts(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Soni M <diptatapa(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres Replaying WAL slowly
Date: 2014-06-30 19:17:05
Message-ID: 0A7B96D4-DB10-49A8-A268-F28960F2E137@pgexperts.com
Lists: pgsql-performance

On Jun 30, 2014, at 11:39 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On 2014-06-30 11:34:52 -0700, Jeff Frost wrote:
>> On Jun 30, 2014, at 10:29 AM, Soni M <diptatapa(at)gmail(dot)com> wrote:
>
>>> It is
>>> 96.62% postgres [.] StandbyReleaseLocks
>>> as Jeff said. It runs for quite a long time, more than 5 minutes I think.
>>>
>>> I also use hot standby. We have 4 streaming replicas; some of them have active connections, some do not. This issue has lasted more than 4 days. On one of the standbys, the postgres process above is the only process consuming high CPU load.
>>
>> Compiling with -fno-omit-frame-pointer doesn't yield much more info:
>
> You'd need to do perf record -ga instead of perf record -a to see
> additional information.
>

Ah! That's right.
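For anyone following along, the profiling commands were roughly the following (a sketch; the sampling duration and the use of `--` with sleep are my assumptions, not something stated in this thread):

```shell
# Record call graphs (-g) across all CPUs (-a) for 30 seconds.
# -g only helps if postgres was built with -fno-omit-frame-pointer,
# as discussed above.
perf record -g -a -- sleep 30

# Browse the recorded samples interactively.
perf report
```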

Here's how that looks:

Samples: 473K of event 'cpu-clock', Event count (approx.): 473738
+ 68.42% init [kernel.kallsyms] [k] native_safe_halt
+ 26.07% postgres postgres [.] StandbyReleaseLocks
+ 2.82% swapper [kernel.kallsyms] [k] native_safe_halt
+ 0.19% ssh libcrypto.so.1.0.1e [.] md5_block_asm_data_order
+ 0.19% postgres postgres [.] RecordIsValid
+ 0.16% ssh libcrypto.so.1.0.1e [.] RC4
+ 0.10% postgres postgres [.] hash_search_with_hash_value
+ 0.06% postgres [kernel.kallsyms] [k] _spin_unlock_irqrestore
+ 0.05% init [vmxnet3] [k] vmxnet3_poll_rx_only
+ 0.04% postgres [kernel.kallsyms] [k] copy_user_generic_unrolled
+ 0.04% init [kernel.kallsyms] [k] finish_task_switch
+ 0.04% init [kernel.kallsyms] [k] __do_softirq
+ 0.04% ssh [kernel.kallsyms] [k] _spin_unlock_irqrestore
+ 0.04% ssh [vmxnet3] [k] vmxnet3_xmit_frame
+ 0.03% postgres postgres [.] PinBuffer
+ 0.03% init [vmxnet3] [k] vmxnet3_xmit_frame
+ 0.03% ssh [kernel.kallsyms] [k] copy_user_generic_unrolled
+ 0.03% postgres postgres [.] XLogReadBufferExtended
+ 0.03% ssh ssh [.] 0x000000000002aa07
+ 0.03% init [kernel.kallsyms] [k] _spin_unlock_irqrestore
+ 0.03% ssh [vmxnet3] [k] vmxnet3_poll_rx_only
+ 0.02% ssh [kernel.kallsyms] [k] __do_softirq
+ 0.02% postgres libc-2.12.so [.] _wordcopy_bwd_dest_aligned
+ 0.02% postgres postgres [.] mdnblocks
+ 0.02% ssh libcrypto.so.1.0.1e [.] 0x00000000000e25a1
+ 0.02% scp [kernel.kallsyms] [k] copy_user_generic_unrolled
+ 0.02% ssh libc-2.12.so [.] memcpy
+ 0.02% postgres libc-2.12.so [.] memcpy

> But:
>
>> 76.24% postgres [.] StandbyReleaseLocks
>
> already is quite helpful.
>
> What are you doing on that system? Is there anything requiring large
> amounts of access exclusive locks on the primary? Possibly large amounts
> of temporary relations?

The last time we did a 100% logging run, the peak temp table creation was something like 120k/hr, but the replicas seemed able to keep up with that just fine.

Hopefully Soni can answer whether that has increased significantly since May.
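One way to gauge whether AccessExclusiveLock traffic on the primary has grown would be something along these lines (a sketch of my own, not a command from this thread; assumes psql access to the primary):

```shell
# Sample the AccessExclusiveLocks currently held on the primary; each one
# is WAL-logged and must be released by StandbyReleaseLocks on the standbys.
psql -Atc "SELECT count(*) FROM pg_locks WHERE mode = 'AccessExclusiveLock';"

# Temp-table churn also takes AccessExclusiveLocks; pg_stat_database's
# temp_files/temp_bytes counters give a rough proxy for that workload.
psql -Atc "SELECT datname, temp_files, temp_bytes FROM pg_stat_database;"
```

Sampling the first query periodically (e.g. from cron) would show whether the lock rate has changed since May.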
