hung backends stuck in spinlock heavy endless loop

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: hung backends stuck in spinlock heavy endless loop
Date: 2015-01-13 22:29:51
Message-ID: CAHyXU0x5mW-SbSuUBEshzumOaN7JPUWa7Ejza68HE-KY0Nq7Kg@mail.gmail.com
Lists: pgsql-hackers

On my workstation today (running vanilla 9.4.0) I was testing some new
code that does aggressive parallel loading into a couple of tables. It
ran OK several dozen times, then froze up with no external trigger.
At most 8 active backends were stuck (the loader's thread count is
capped) -- each query typically resolves in a few seconds,
but they had been hung for 30+ minutes. I had to do an immediate restart
because the backends were not responding to cancel, but I snapped a
'perf top' before I did so. The results were interesting, so I'm posting
them here. So far I have not been able to reproduce the hang. FYI:

61.03% postgres [.] s_lock
13.56% postgres [.] LWLockRelease
10.11% postgres [.] LWLockAcquire
4.02% perf [.] 0x00000000000526d3
1.65% postgres [.] _bt_compare
1.60% libc-2.17.so [.] 0x0000000000081069
0.66% [kernel] [k] kallsyms_expand_symbol.constprop.1
0.60% [kernel] [k] format_decode
0.57% [kernel] [k] number.isra.1
0.47% [kernel] [k] memcpy
0.44% postgres [.] ReleaseAndReadBuffer
0.44% postgres [.] FunctionCall2Coll
0.41% [kernel] [k] vsnprintf
0.41% [kernel] [k] module_get_kallsym
0.32% postgres [.] _bt_relandgetbuf
0.31% [kernel] [k] string.isra.5
0.31% [kernel] [k] strnlen
0.31% postgres [.] _bt_moveright
0.28% libc-2.17.so [.] getdelim
0.22% postgres [.] LockBuffer
0.16% [kernel] [k] seq_read
0.16% libc-2.17.so [.] __libc_calloc
0.13% postgres [.] _bt_checkpage
0.09% [kernel] [k] pointer.isra.15
0.09% [kernel] [k] update_iter
0.08% plugin_host [.] PyObject_GetAttr
0.06% [kernel] [k] strlcpy
0.06% [kernel] [k] seq_vprintf
0.06% [kernel] [k] copy_user_enhanced_fast_string
0.06% libc-2.17.so [.] _IO_feof
0.06% postgres [.] btoidcmp
0.06% [kernel] [k] page_fault
0.06% libc-2.17.so [.] free
0.06% libc-2.17.so [.] memchr
0.06% libpthread-2.17.so [.] __pthread_mutex_unlock_usercnt
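
As a rough aid to reading the profile above, here is a minimal Python sketch that parses 'perf top' lines of this shape and sums the overhead attributed to the three lock routines at the top of the list. The function names (parse_perf_top, overhead_by_object) and the regular expression are illustrative assumptions, not part of perf or PostgreSQL, and the sample holds only the first five lines of the listing.

```python
import re
from collections import defaultdict

# Assumed line shape: "<overhead>% <object> [.|k] <symbol>", as in the listing above.
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+\[([.k])\]\s+(\S+)\s*$")


def parse_perf_top(text):
    """Return a list of (overhead_percent, object_name, symbol) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # silently skip lines that do not look like profile rows
            rows.append((float(m.group(1)), m.group(2), m.group(4)))
    return rows


def overhead_by_object(rows):
    """Sum overhead percentages per binary/shared object."""
    totals = defaultdict(float)
    for pct, obj, _sym in rows:
        totals[obj] += pct
    return dict(totals)


# First five lines of the profile quoted in this message.
SAMPLE = """\
61.03% postgres [.] s_lock
13.56% postgres [.] LWLockRelease
10.11% postgres [.] LWLockAcquire
4.02% perf [.] 0x00000000000526d3
1.65% postgres [.] _bt_compare
"""

if __name__ == "__main__":
    rows = parse_perf_top(SAMPLE)
    lock_syms = {"s_lock", "LWLockRelease", "LWLockAcquire"}
    lock_pct = sum(p for p, _obj, sym in rows if sym in lock_syms)
    print(f"lock-related overhead: {lock_pct:.2f}%")
    for obj, pct in sorted(overhead_by_object(rows).items(), key=lambda kv: -kv[1]):
        print(f"{obj}: {pct:.2f}%")
```

On the figures quoted, s_lock, LWLockRelease and LWLockAcquire together account for roughly 85% of the samples. That only describes where the CPU time went; it does not by itself identify the cause of the hang.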

merlin
