Re: Waiting on ExclusiveLock on extension

From: Andomar <andomar(at)aule(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Waiting on ExclusiveLock on extension
Date: 2015-04-20 19:05:40
Message-ID: 55354E04.6050107@aule.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> Would you be able to get a stack trace of a backend that's holding an
> extension lock? Or maybe perf would provide some insight.
>

The outage occurred again but briefer. There were no ExclusiveLock
messages, presumably because the timeout for logging locks was not
exceeded. But all available connection slots were used and many
incoming requests were denied.

Below you'll find the "perf report" and the "zoomed in" (I suppose
that's the callstack) of the top entry.

The top entry's call stack has these 4 postmaster functions near the top:

PinBuffer
LockRelease
hash_any
HeapTupleSatisfiesMVCC

We'll be rolling back from 9.4.1 to 9.3.6 tonight, hoping that will
resolve the issue.

===== BELOW A 10 SECOND perf top CAPTURE DURING THE OUTAGE

# ========
# captured on: Mon Apr 20 20:34:43 2015
# hostname : db1a
# os release : 2.6.32-504.1.3.el6.x86_64
# perf version : 2.6.32-504.1.3.el6.x86_64.debug
# arch : x86_64
# nrcpus online : 24
# nrcpus avail : 24
# cpudesc : Intel(R) Xeon(R) CPU E5-2667 0 @ 2.90GHz
# cpuid : GenuineIntel,6,45,7
# total memory : 49373964 kB
# cmdline : /usr/bin/perf record -a -o 2015-04-20_20:34:28 sleep 10
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2
= 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 1,
precise_ip = 0, attr_mmap2 = 0, attr_mmap = 1, attr_mmap_data = 0
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: uncore_r3qpi_1 = 23, uncore_r3qpi_0 = 22, uncore_r2pcie
= 21, uncore_qpi_1 = 20, uncore_qpi_0 = 19, uncore_imc_3 = 18,
uncore_imc_2 = 17, uncore_imc_1 = 16, uncore_imc_0 = 15, uncore_ha = 14,
uncore_pcu = 13, uncore_cbox_5 = 12, uncore_c
# ========
#
# Samples: 960K of event 'cycles'
# Event count (approx.): 757057831613
#
# Overhead Command Shared Object
Symbol
# ........ ............... .....................
.................................................
#
59.73% postmaster [kernel.kallsyms] [k] compaction_alloc
1.31% postmaster [kernel.kallsyms] [k] _spin_lock
0.94% postmaster [kernel.kallsyms] [k]
__reset_isolation_suitable
0.78% postmaster [kernel.kallsyms] [k] compact_zone
0.67% postmaster [kernel.kallsyms] [k]
get_pageblock_flags_group
0.64% postmaster [kernel.kallsyms] [k] copy_page_c
0.48% :13410 [kernel.kallsyms] [k] compaction_alloc
0.45% :13465 [kernel.kallsyms] [k] compaction_alloc
0.45% postmaster [kernel.kallsyms] [k] clear_page_c
0.44% postmaster postgres [.]
hash_search_with_hash_value
0.41% :13324 [kernel.kallsyms] [k] compaction_alloc
0.40% :13561 [kernel.kallsyms] [k] compaction_alloc
0.38% :13374 [kernel.kallsyms] [k] compaction_alloc
0.37% :13272 [kernel.kallsyms] [k] compaction_alloc
0.37% postmaster [kernel.kallsyms] [k] unmap_vmas
0.36% postmaster [kernel.kallsyms] [k] page_fault
0.36% :13380 [kernel.kallsyms] [k] compaction_alloc
0.35% :13482 [kernel.kallsyms] [k] compaction_alloc
0.34% :13555 [kernel.kallsyms] [k] compaction_alloc
0.34% postmaster [kernel.kallsyms] [k]
set_pageblock_flags_group
0.34% postmaster [kernel.kallsyms] [k] page_check_address
0.33% :13528 [kernel.kallsyms] [k] compaction_alloc
0.33% :13464 [kernel.kallsyms] [k] compaction_alloc
0.31% :13547 [kernel.kallsyms] [k] compaction_alloc
0.30% postmaster [kernel.kallsyms] [k] _spin_lock_irqsave
0.29% :13395 [kernel.kallsyms] [k] compaction_alloc
0.29% :13546 [kernel.kallsyms] [k] compaction_alloc
0.28% postmaster [kernel.kallsyms] [k]
remove_migration_pte
0.28% :13355 [kernel.kallsyms] [k] compaction_alloc
0.28% postmaster [kernel.kallsyms] [k] list_del
0.28% :13432 [kernel.kallsyms] [k] compaction_alloc
0.27% :13258 [kernel.kallsyms] [k] compaction_alloc
0.27% :13328 [kernel.kallsyms] [k] compaction_alloc
0.26% postmaster [kernel.kallsyms] [k] __wake_up_bit
0.26% :13361 [kernel.kallsyms] [k] compaction_alloc
0.25% :13334 [kernel.kallsyms] [k] compaction_alloc
0.25% :13366 [kernel.kallsyms] [k] compaction_alloc
0.25% :13549 [kernel.kallsyms] [k] compaction_alloc
0.25% :13530 [kernel.kallsyms] [k] compaction_alloc
0.24% :13391 [kernel.kallsyms] [k] compaction_alloc
0.24% :13387 [kernel.kallsyms] [k] compaction_alloc
0.24% :13364 [kernel.kallsyms] [k] compaction_alloc
0.24% :13529 [kernel.kallsyms] [k] compaction_alloc
0.24% postmaster [kernel.kallsyms] [k] find_get_page
0.23% postmaster postgres [.] _bt_compare
0.22% :13522 [kernel.kallsyms] [k] compaction_alloc
0.22% :13579 [kernel.kallsyms] [k] compaction_alloc
0.22% :13412 [kernel.kallsyms] [k] compaction_alloc
0.21% postmaster postgres [.] AllocSetAlloc
0.21% :13407 [kernel.kallsyms] [k] compaction_alloc
0.20% :13569 [kernel.kallsyms] [k] compaction_alloc
0.20% :13322 [kernel.kallsyms] [k] compaction_alloc
0.20% :13554 [kernel.kallsyms] [k] compaction_alloc
0.20% :13327 [kernel.kallsyms] [k] compaction_alloc

===== BELOW IS THE ZOOM OF THE TOP ENTRY

Samples: 604K of event 'cycles', Event count (approx.): 476360854137,
Thread: postmaster(11565)
94.92% postmaster [kernel.kallsyms] [k] compaction_alloc
1.49% postmaster [kernel.kallsyms] [k] __reset_isolation_suitable
1.01% postmaster [kernel.kallsyms] [k] copy_page_c
0.54% postmaster [kernel.kallsyms] [k] set_pageblock_flags_group
0.41% postmaster [kernel.kallsyms] [k] __wake_up_bit
0.37% postmaster [kernel.kallsyms] [k] find_get_page
0.16% postmaster [kernel.kallsyms] [k] find_vma
0.16% postmaster libc-2.12.so [.] _int_malloc
0.16% postmaster postgres [.] PinBuffer
–’0.14% postmaster [kernel.kallsyms] [k] __alloc_pages_nodemask
–’0.11% postmaster [kernel.kallsyms] [k] filemap_fault
–’0.08% postmaster postgres [.] LockRelease
–’0.08% postmaster postgres [.] hash_any
–’0.06% postmaster libc-2.12.so [.] _int_free
–’0.06% postmaster [kernel.kallsyms] [k] page_remove_rmap
–’0.03% postmaster postgres [.] HeapTupleSatisfiesMVCC
–’0.03% postmaster [kernel.kallsyms] [k] free_pages_and_swap_cache
–’0.03% postmaster postgres [.] s_lock
–’0.02% postmaster [kernel.kallsyms] [k] unlink_anon_vmas
–’0.02% postmaster [kernel.kallsyms] [k] mem_cgroup_lru_add_list
–’0.02% postmaster postgres [.] ReindexIsProcessingIndex
–’0.01% postmaster [kernel.kallsyms] [k] set_page_dirty
–’0.01% postmaster postgres [.] eqjoinsel
–’0.01% postmaster [kernel.kallsyms] [k] perf_event_aux_ctx
–’0.01% postmaster postgres [.] hash_seq_search
–’0.01% postmaster postgres [.] get_mergejoin_opfamilies
–’0.01% postmaster [kernel.kallsyms] [k] acl_permission_check
–’0.01% postmaster libc-2.12.so [.] __sigsetjmp
–’0.01% postmaster postgres [.] GetUserId
–’0.00% postmaster postgres [.] pull_up_subqueries_recurse
–’0.00% postmaster [kernel.kallsyms] [k] do_brk
–’0.00% postmaster postgres [.] SearchSysCacheList
–’0.00% postmaster postgres [.] join_search_one_level

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Merlin Moncure 2015-04-20 19:43:22 Re: "Cast" SRF returning record to a table type?
Previous Message Andreas Joseph Krogh 2015-04-20 18:48:48 Re: How to keep pg_largeobject from growing endlessly