Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown

From: Joseph B <joseph(dot)bylund(at)gmail(dot)com>
To: David Kohn <djk447(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown
Date: 2018-03-16 20:30:53
Message-ID: CAAJ=Jc1N+saY6yqqdL+8_7foCFQzSwNsfZE=S8Yi6=3m6Q87RA@mail.gmail.com
Lists: pgsql-bugs

On Fri, Mar 16, 2018 at 2:42 PM, David Kohn <djk447(at)gmail(dot)com> wrote:

> Sorry for the delay on this.
>
>>
>> You can get a backtrace from a running program by connecting to
>> it with gdb -p PID, then bt for the backtrace. You might need to
>> install the symbols package if you only see addresses (on debianoid
>> systems postgresql-10-dbgsym, not sure what it's called on RHELish
>> systems).
>>
>> https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
>>
> Here's a backtrace from one of the running pids:
> ```
> #0 0x00007f699037c9f3 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:84
> #1 0x00005639b9d28c61 in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7ffd43678d90, cur_timeout=-1, set=0x5639bbef83c8) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/storage/ipc/latch.c:1048
> #2 WaitEventSetWait (set=set@entry=0x5639bbef83c8, timeout=timeout@entry=-1, occurred_events=occurred_events@entry=0x7ffd43678d90, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=134217728) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/storage/ipc/latch.c:1000
> #3 0x00005639b9d290d4 in WaitLatchOrSocket (latch=0x7f697e506f54, wakeEvents=wakeEvents@entry=17, sock=sock@entry=-1, timeout=-1, timeout@entry=0, wait_event_info=wait_event_info@entry=134217728) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/storage/ipc/latch.c:385
> #4 0x00005639b9d29185 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=17, timeout=timeout@entry=0, wait_event_info=wait_event_info@entry=134217728) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/storage/ipc/latch.c:339
> #5 0x00005639b9cccf1b in WaitForBackgroundWorkerShutdown (handle=0x5639bbe8aaf0) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/bgworker.c:1154
> #6 0x00005639b9af36fd in WaitForParallelWorkersToExit (pcxt=0x5639bbe8a118, pcxt=0x5639bbe8a118) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/access/transam/parallel.c:655
> #7 0x00005639b9af4417 in DestroyParallelContext (pcxt=0x5639bbe8a118) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/access/transam/parallel.c:737
> #8 0x00005639b9af4a28 in AtEOXact_Parallel (isCommit=isCommit@entry=0 '\000') at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/access/transam/parallel.c:1006
> #9 0x00005639b9affde7 in AbortTransaction () at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/access/transam/xact.c:2538
> #10 0x00005639b9b00545 in AbortCurrentTransaction () at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/access/transam/xact.c:3097
> #11 0x00005639b9d4bd6d in PostgresMain (argc=1, argv=argv@entry=0x5639bbe9ae40, dbname=0x5639bbe9ad58 "marjory", username=0x5639bbe39a08 "reporter") at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/tcop/postgres.c:3879
> #12 0x00005639b9a850d9 in BackendRun (port=0x5639bbe978f0) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/postmaster.c:4405
> #13 BackendStartup (port=0x5639bbe978f0) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/postmaster.c:4077
> #14 ServerLoop () at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/postmaster.c:1755
> #15 0x00005639b9cdb78b in PostmasterMain (argc=5, argv=<optimized out>) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/postmaster.c:1363
> #16 0x00005639b9a864d5 in main (argc=5, argv=0x5639bbe37850) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/main/main.c:228
> ```
> Joe, cc'd here is a colleague who will be able to help out in future.
>
> Thanks for all the help on this,
> David
>
>
I think this should be the other part of the equation:

#0 0x00007f699037c9f3 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1 0x00005639b9d28c61 in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7ffd43678540, cur_timeout=-1, set=0x5639bbe3a388) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/storage/ipc/latch.c:1048
#2 WaitEventSetWait (set=set@entry=0x5639bbe3a388, timeout=timeout@entry=-1, occurred_events=occurred_events@entry=0x7ffd43678540, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=134217735) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/storage/ipc/latch.c:1000
#3 0x00005639b9d290d4 in WaitLatchOrSocket (latch=0x7f697e544ea4, wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, timeout=-1, timeout@entry=0, wait_event_info=wait_event_info@entry=134217735) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/storage/ipc/latch.c:385
#4 0x00005639b9d29185 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=0, wait_event_info=wait_event_info@entry=134217735) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/storage/ipc/latch.c:339
#5 0x00005639b9c55580 in mq_putmessage (msgtype=69 'E', s=<optimized out>, len=<optimized out>) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/libpq/pqmq.c:171
#6 0x00005639b9c54d44 in pq_endmessage (buf=buf@entry=0x7ffd43678670) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/libpq/pqformat.c:347
#7 0x00005639b9e5d68c in send_message_to_frontend (edata=<optimized out>) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/utils/error/elog.c:3314
#8 EmitErrorReport () at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/utils/error/elog.c:1483
#9 0x00005639b9ccc826 in StartBackgroundWorker () at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/bgworker.c:779
#10 0x00005639b9cd96cb in do_start_bgworker (rw=<optimized out>) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/postmaster.c:5728
#11 maybe_start_bgworkers () at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/postmaster.c:5941
#12 0x00005639b9cda385 in sigusr1_handler (postgres_signal_arg=<optimized out>) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/postmaster.c:5121
#13 <signal handler called>
#14 0x00007f69903725b3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
#15 0x00005639b9a8468c in ServerLoop () at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/postmaster.c:1719
#16 0x00005639b9cdb78b in PostmasterMain (argc=5, argv=<optimized out>) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/postmaster/postmaster.c:1363
#17 0x00005639b9a864d5 in main (argc=5, argv=0x5639bbe37850) at /build/postgresql-10-drhiey/postgresql-10-10.3/build/../src/backend/main/main.c:228
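
For anyone reading the two traces side by side, the numeric wait_event_info and wakeEvents arguments can be decoded into names. What follows is only a minimal Python sketch: it assumes the PostgreSQL 10 values of PG_WAIT_IPC and the WaitEventIPC enum (pgstat.h) and of the WL_* flags (latch.h) as I understand them, and the helper names decode_wait_event and decode_wake_events are made up for illustration. The tables should be checked against the source tree before relying on them for another version.

```python
# Assumed from PostgreSQL 10's pgstat.h: the top byte of wait_event_info is
# the wait class and the low 16 bits index into that class's enum.
PG_WAIT_IPC = 0x08000000

# Assumed: first eight members of the PG 10 WaitEventIPC enum, in order.
IPC_EVENTS = [
    "WAIT_EVENT_BGWORKER_SHUTDOWN",
    "WAIT_EVENT_BGWORKER_STARTUP",
    "WAIT_EVENT_BTREE_PAGE",
    "WAIT_EVENT_EXECUTE_GATHER",
    "WAIT_EVENT_LOGICAL_SYNC_DATA",
    "WAIT_EVENT_LOGICAL_SYNC_STATE_CHANGE",
    "WAIT_EVENT_MQ_INTERNAL",
    "WAIT_EVENT_MQ_PUT_MESSAGE",
]

# Assumed from PostgreSQL 10's latch.h: bit flags passed as wakeEvents.
WL_FLAGS = [
    (1 << 0, "WL_LATCH_SET"),
    (1 << 1, "WL_SOCKET_READABLE"),
    (1 << 2, "WL_SOCKET_WRITEABLE"),
    (1 << 3, "WL_TIMEOUT"),
    (1 << 4, "WL_POSTMASTER_DEATH"),
]


def decode_wait_event(info):
    """Split wait_event_info into class and event; only IPC is named here."""
    wait_class = info & 0xFF000000
    event_id = info & 0x0000FFFF
    if wait_class != PG_WAIT_IPC:
        # Other classes are deliberately left undecoded in this sketch.
        return "class 0x%08X, event %d" % (wait_class, event_id)
    if event_id < len(IPC_EVENTS):
        return "IPC: " + IPC_EVENTS[event_id]
    return "IPC: event %d" % event_id


def decode_wake_events(mask):
    """Return the names of the WL_* bits set in a wakeEvents mask."""
    return [name for bit, name in WL_FLAGS if mask & bit]


if __name__ == "__main__":
    for value in (134217728, 134217735):
        print(value, "->", decode_wait_event(value))
    for mask in (17, 1):
        print(mask, "->", decode_wake_events(mask))
```

Under those assumptions, 134217728 in the first trace decodes to the BgWorkerShutdown wait and 134217735 in the second to the message-queue put wait, which lines up with the WaitForBackgroundWorkerShutdown and mq_putmessage frames. Likewise wakeEvents=17 is latch-set plus postmaster-death, while wakeEvents=1 is latch-set only.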
