BUG #18862: using pg_autoctl node_2 cannot bring back from maintenance mode

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: m(dot)kancevicius(at)gmail(dot)com
Subject: BUG #18862: using pg_autoctl node_2 cannot bring back from maintenance mode
Date: 2025-03-24 15:28:31
Message-ID: 18862-b641a5dac163d5ca@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18862
Logged by: mindaugas kancevicius
Email address: m(dot)kancevicius(at)gmail(dot)com
PostgreSQL version: 16.7
Operating system: RHEL 8.10
Description:

Hello,
I am using pg_auto_failover with PostgreSQL 16.7 on RHEL 8.10, with the
following nodes:
monitor node
node_1
node_2
When applying configuration changes on node_1, I enable maintenance on node_2:
pg_autoctl enable maintenance
When the configuration is completed on node_1, I disable maintenance:
pg_autoctl disable maintenance
This sequence worked fine 5-7 times while I made changes, and each time
node_2 caught up with node_1's TLI: LSN.
The last time I enabled and then disabled maintenance, node_2 somehow froze,
and I received an error when trying to bring it back from maintenance mode:
Name   | Node | Host:Port   | TLI: LSN       | Connection | Reported State | Assigned State
-------+------+-------------+----------------+------------+----------------+---------------
node_1 | 1    | node_1:5432 | 4: 97/6C000110 | read-write | single         | single
node_2 | 2    | node_2:5432 | 4: 97/6BD90750 | none !     | maintenance    | catchingup
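
The mismatch in the output above (node_2 reports "maintenance" while the
monitor assigns "catchingup") can also be spotted with a script. The following
is only a minimal sketch: it assumes the state table has been saved as text
with one row per line, and the helper names parse_state_table and
unsettled_nodes are hypothetical, not part of pg_auto_failover.

```python
def parse_state_table(text):
    """Parse pipe-separated state rows (one node per line) into dicts."""
    rows = []
    header = None
    for line in text.strip().splitlines():
        # Skip the dashed separator line under the header.
        if set(line.strip()) <= set("-+"):
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells  # first non-separator line holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def unsettled_nodes(rows):
    """Return names of nodes whose reported state differs from the assigned one."""
    return [r["Name"] for r in rows if r["Reported State"] != r["Assigned State"]]


# Sample input copied from the stuck state in this report.
SAMPLE = """
Name   | Node | Host:Port   | TLI: LSN       | Connection | Reported State | Assigned State
-------+------+-------------+----------------+------------+----------------+---------------
node_1 | 1    | node_1:5432 | 4: 97/6C000110 | read-write | single         | single
node_2 | 2    | node_2:5432 | 4: 97/6BD90750 | none !     | maintenance    | catchingup
"""

print(unsettled_nodes(parse_state_table(SAMPLE)))  # ['node_2']
```

A node that stays in this list across repeated checks has not reached the
state the monitor assigned to it, which matches what is described here.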

The last known TLI: LSN on both nodes was 4: 97/6BD90338:
Name   | Node | Host:Port   | TLI: LSN       | Connection | Reported State | Assigned State
-------+------+-------------+----------------+------------+----------------+---------------
node_1 | 1    | node_1:5432 | 4: 97/6BD90338 | read-write | primary        | primary
node_2 | 2    | node_2:5432 | 4: 97/6BD90338 | read-only  | secondary      | secondary
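
To put the LSN values from the two outputs in perspective, the gaps can be
computed in bytes. A PostgreSQL LSN is printed as two hexadecimal numbers, the
high 32 bits and the low 32 bits of a 64-bit WAL position. The sketch below
uses a hypothetical helper, lsn_to_int, and only the values quoted in this
report.

```python
def lsn_to_int(lsn):
    """Convert an LSN such as '97/6C000110' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


stuck_node_1 = lsn_to_int("97/6C000110")   # node_1 while node_2 is stuck
stuck_node_2 = lsn_to_int("97/6BD90750")   # node_2 while stuck
last_in_sync = lsn_to_int("97/6BD90338")   # both nodes, last known state

# How far node_2 is behind node_1 in the stuck state.
print(stuck_node_1 - stuck_node_2)  # 2554304

# How far node_2's reported LSN moved since the last known state.
print(stuck_node_2 - last_in_sync)  # 1048
```

So node_2 is roughly 2.4 MiB of WAL behind node_1 in the stuck state, and its
reported LSN advanced by only 1048 bytes since the last known state.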

This has happened to me twice in 2 days, each time after several rounds of
enabling and disabling maintenance mode.

Is there any known issue that would explain why this happens randomly for
node_2? The only way I was able to fix it was to drop node_2 from the monitor
node and redo the setup on node_2.
