Postgres 16 slow "fast" shutdown when using streaming replication

From: Stefan Kohlhauser <chocolatbuddha(at)gmail(dot)com>
To: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Postgres 16 slow "fast" shutdown when using streaming replication
Date: 2024-02-02 11:25:12
Message-ID: CABL5_PHcjNWV+m8PTwq39Zp93tPENGoN=wEJTY-QN4YUc+QzuQ@mail.gmail.com
Lists: pgsql-admin

Hey!

We have a Pacemaker-controlled cluster with one Postgres primary and one
standby.
PG 16.1-2PGDG.rhel9.x86_64 taken from
https://download.postgresql.org/pub/repos/yum/16/redhat/rhel-9.2-x86_64/
OS: RHEL 9.2
Linux kamailionode1 5.14.0-284.11.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Apr 12 10:45:03 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

PG is used for a Kamailio SIP server.

On "crashing" (VM power off) the node with the primary PG, the standby PG
is restarted (using "fast" shutdown) and promoted.
However, it seems like the standby is trying to connect to the "crashed"
primary for up to a minute before giving up.

2024-01-30T09:04:45.519+00:00 [...] info: [8-1] 2024-01-30 09:04:45.263 UTC [1732] LOG: received fast shutdown request
2024-01-30T09:04:45.519+00:00 [...] info: [9-1] 2024-01-30 09:04:45.266 UTC [1732] LOG: aborting any active transactions
2024-01-30T09:05:41.946+00:00 [...] err: [8-1] 2024-01-30 09:05:41.683 UTC [2417] FATAL: could not connect to the primary server: connection to server at "pgreplicationha" (10.40.51.133), port 5432 failed: No route to host
2024-01-30T09:05:41.946+00:00 [...] err: [8-2] Is the server running on that host and accepting TCP/IP connections?
2024-01-30T09:05:41.946+00:00 [...] info: [7-1] 2024-01-30 09:05:41.686 UTC [1736] LOG: shutting down
2024-01-30T09:05:41.946+00:00 [...] info: [10-1] 2024-01-30 09:05:41.691 UTC [1732] LOG: database system is shut down

We previously used PG 12.11, where this worked fine: the shutdown was
immediate. In a few quick tests with PG 15.2 we also did not see the
long shutdown behaviour.

2024-02-02T08:37:13.001+00:00 [...] info: [8-1] 2024-02-02 08:37:12.779 GMT [2106446] LOG: received fast shutdown request
2024-02-02T08:37:13.001+00:00 [...] info: [9-1] 2024-02-02 08:37:12.781 GMT [2106446] LOG: aborting any active transactions
2024-02-02T08:37:13.001+00:00 [...] err: [9-1] 2024-02-02 08:37:12.783 GMT [2106500] FATAL: terminating walreceiver process due to administrator command
2024-02-02T08:37:13.001+00:00 [...] info: [7-1] 2024-02-02 08:37:12.785 GMT [2106455] LOG: shutting down
2024-02-02T08:37:13.001+00:00 [...] info: [10-1] 2024-02-02 08:37:12.794 GMT [2106446] LOG: database system is shut down
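
To put numbers on the difference, here is a small Python sketch that
computes the time between "received fast shutdown request" and "database
system is shut down" in the two excerpts above. The timestamps are copied
from the logs; the helper name shutdown_seconds is only illustrative.

```python
from datetime import datetime

# Timestamp layout used inside the PostgreSQL log lines (%m in log_line_prefix),
# without the trailing time zone abbreviation.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def shutdown_seconds(requested: str, finished: str) -> float:
    """Seconds between the shutdown request and the final 'shut down' line."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(requested, FMT)
    return delta.total_seconds()


# First excerpt (PG 16.1): request at 09:04:45.263, shut down at 09:05:41.691
pg16 = shutdown_seconds("2024-01-30 09:04:45.263", "2024-01-30 09:05:41.691")

# Second excerpt (older version): request at 08:37:12.779, shut down at 08:37:12.794
older = shutdown_seconds("2024-02-02 08:37:12.779", "2024-02-02 08:37:12.794")

print(f"PG 16 excerpt:  {pg16:.3f} s")
print(f"older excerpt:  {older:.3f} s")
```

That is roughly 56 seconds for the PG 16 shutdown against about 15
milliseconds for the older one.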

Sometimes, PG 16 shuts down fast like the previous versions did. But most
of the time it doesn't.

                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 16.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat 11.4.1-2), 64-bit
(1 row)

 name                         | current_setting          | source
------------------------------+--------------------------+--------------------
 application_name             | psql                     | client
 autovacuum                   | on                       | configuration file
 client_encoding              | UTF8                     | client
 DateStyle                    | ISO, MDY                 | configuration file
 default_text_search_config   | pg_catalog.english       | configuration file
 dynamic_shared_memory_type   | posix                    | configuration file
 effective_cache_size         | 512MB                    | configuration file
 hot_standby                  | on                       | configuration file
 hot_standby_feedback         | on                       | configuration file
 lc_messages                  | en_US.UTF-8              | configuration file
 lc_monetary                  | en_US.UTF-8              | configuration file
 lc_numeric                   | en_US.UTF-8              | configuration file
 lc_time                      | en_US.UTF-8              | configuration file
 listen_addresses             | *                        | configuration file
 log_checkpoints              | off                      | configuration file
 log_connections              | on                       | configuration file
 log_destination              | syslog                   | configuration file
 log_directory                | log                      | configuration file
 log_disconnections           | on                       | configuration file
 log_filename                 | postgresql-%a.log        | configuration file
 log_line_prefix              | %m [%p]                  | configuration file
 log_min_duration_statement   | 1s                       | configuration file
 log_rotation_age             | 0                        | configuration file
 log_rotation_size            | 0                        | configuration file
 log_timezone                 | Etc/UTC                  | configuration file
 log_truncate_on_rotation     | on                       | configuration file
 logging_collector            | on                       | configuration file
 max_connections              | 400                      | configuration file
 max_standby_archive_delay    | 5s                       | configuration file
 max_standby_streaming_delay  | 5s                       | configuration file
 max_wal_senders              | 16                       | configuration file
 max_wal_size                 | 1GB                      | configuration file
 min_wal_size                 | 80MB                     | configuration file
 primary_conninfo             | host=pgreplicationha port=5432 user=replicate application_name=kamailionode2 | configuration file
 recovery_target_timeline     | latest                   | configuration file
 restart_after_crash          | off                      | configuration file
 shared_buffers               | 128MB                    | configuration file
 ssl                          | on                       | configuration file
 ssl_ca_file                  | root.crt                 | configuration file
 ssl_cert_file                | server.crt               | configuration file
 ssl_ciphers                  | HIGH:MEDIUM:+3DES:!aNULL | configuration file
 ssl_key_file                 | server.key               | configuration file
 ssl_min_protocol_version     | TLSv1.2                  | configuration file
 synchronous_commit           | off                      | configuration file
 syslog_facility              | local0                   | configuration file
 syslog_ident                 | postgres-kamailio        | configuration file
 TimeZone                     | Etc/UTC                  | configuration file
 track_counts                 | on                       | configuration file
 unix_socket_directories      | /var/run/postgresql      | configuration file
 wal_compression              | pglz                     | configuration file
 wal_keep_size                | 256MB                    | configuration file
 wal_receiver_status_interval | 3s                       | configuration file
 wal_sender_timeout           | 10s                      | configuration file
(53 rows)

Any ideas what might be causing this different behaviour on PG 16?

TIA
