pg_prewarm bgworker could break fast shutdown

From: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_prewarm bgworker could break fast shutdown
Date: 2020-10-28 19:43:48
Message-ID: CAFh8B==je3zJ77KiBeTR4q5+O_3uCxYah-nEehPkmR4JYg-LYg@mail.gmail.com
Lists: pgsql-hackers

Hello,

If a fast shutdown is initiated before pg_prewarm has managed to load
the buffers from the dump (and start its main loop), the pg_prewarm
bgworker process never exits on SIGTERM, effectively preventing a
clean shutdown of the cluster.

This problem has bitten me a few times, but yesterday I managed to attach
to the pg_prewarm process and get a stack trace:
(gdb) bt
#0  0x00007f394d788d27 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x000056059d6412f9 in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7ffc598f2b00, cur_timeout=-1, set=0x56059f5757d8)
    at ./build/../src/backend/storage/ipc/latch.c:1048
#2  WaitEventSetWait (set=set@entry=0x56059f5757d8, timeout=timeout@entry=-1, occurred_events=occurred_events@entry=0x7ffc598f2b00, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=134217728)
    at ./build/../src/backend/storage/ipc/latch.c:1000
#3  0x000056059d641748 in WaitLatchOrSocket (latch=0x7f393ec32164, wakeEvents=wakeEvents@entry=17, sock=sock@entry=-1, timeout=-1, timeout@entry=0, wait_event_info=wait_event_info@entry=134217728)
    at ./build/../src/backend/storage/ipc/latch.c:385
#4  0x000056059d641805 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=17, timeout=timeout@entry=0, wait_event_info=wait_event_info@entry=134217728)
    at ./build/../src/backend/storage/ipc/latch.c:339
#5  0x000056059d5e1d40 in WaitForBackgroundWorkerShutdown (handle=0x56059f57e9b0)
    at ./build/../src/backend/postmaster/bgworker.c:1153
#6  0x00007f3944e1a180 in apw_start_database_worker ()
    at ./build/../contrib/pg_prewarm/autoprewarm.c:866
#7  0x00007f3944e1a739 in apw_load_buffers ()
    at ./build/../contrib/pg_prewarm/autoprewarm.c:404
#8  autoprewarm_main (main_arg=<optimized out>)
    at ./build/../contrib/pg_prewarm/autoprewarm.c:203
#9  0x000056059d5e16ee in StartBackgroundWorker ()
    at ./build/../src/backend/postmaster/bgworker.c:834
#10 0x000056059d5ed58c in do_start_bgworker (rw=0x56059f56cd10)
    at ./build/../src/backend/postmaster/postmaster.c:5713
#11 maybe_start_bgworkers ()
    at ./build/../src/backend/postmaster/postmaster.c:5939
#12 0x000056059d5ee02d in sigusr1_handler (postgres_signal_arg=<optimized out>)
    at ./build/../src/backend/postmaster/postmaster.c:5086
#13 <signal handler called>
#14 0x00007f394d77e0f7 in select () from /lib/x86_64-linux-gnu/libc.so.6
#15 0x000056059d5ee58b in ServerLoop ()
    at ./build/../src/backend/postmaster/postmaster.c:1671
#16 0x000056059d5f038d in PostmasterMain (argc=17, argv=0x56059f51a080)
    at ./build/../src/backend/postmaster/postmaster.c:1380
#17 0x000056059d37a992 in main (argc=17, argv=0x56059f51a080)
    at ./build/../src/backend/main/main.c:228

This happened on 11.9, but after looking at HEAD I think the problem
still exists there.
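
To illustrate the kind of hang described above, here is a minimal sketch in
plain C. It does not use any PostgreSQL API and is not taken from
autoprewarm.c or bgworker.c; every name in it (got_sigterm, worker_done,
wait_for_worker_or_shutdown, the idle callback) is a hypothetical stand-in.
It only shows the general idea that a loop waiting for a child worker should
also stop waiting once a SIGTERM has been recorded, assuming the signal
handler does nothing more than set a flag.

```c
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL symbols. */
static volatile sig_atomic_t got_sigterm = 0;  /* set by the signal handler */
static volatile sig_atomic_t worker_done = 0;  /* set when the child worker exits */

/* The handler only records the request; the waiting loop must act on it. */
static void
handle_sigterm(int signo)
{
    (void) signo;
    got_sigterm = 1;
}

/*
 * Wait for the (hypothetical) child worker to finish, but give up as soon
 * as a shutdown has been requested.  "idle" stands in for whatever blocking
 * wait the real code would use between checks; it may be NULL.
 *
 * Returns true if the worker finished, false if the wait was abandoned
 * because of SIGTERM.
 */
static bool
wait_for_worker_or_shutdown(void (*idle) (void))
{
    for (;;)
    {
        if (worker_done)
            return true;
        if (got_sigterm)
            return false;       /* without this check the loop never ends */
        if (idle != NULL)
            idle();
    }
}
```

A caller would install handle_sigterm with signal(SIGTERM, handle_sigterm)
and treat a false return as a request to exit promptly. Whether this matches
what the real code should do is an assumption on my part.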

Regards,
--
Alexander Kukushkin
