From: | postgresql(dot)org(at)gclough(dot)com |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #13770: Extending recovery_min_apply_delay on Standby causes it to be unavailable for a while |
Date: | 2015-11-11 15:50:06 |
Message-ID: | 20151111155006.2644.84564@wrigleys.postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 13770
Logged by: Greg Clough
Email address: postgresql(dot)org(at)gclough(dot)com
PostgreSQL version: 9.4.5
Operating system: Oracle Linux v6.7
Description:
If you extend the recovery_min_apply_delay on a standby and then restart it,
you cannot connect to it for read-only transactions until the most recent
transaction time >= delay time. This is problematic if you extend from say
1 hour to 24 hours, as you have to wait almost a day before you can connect
to your Standby again.
I can't find anything in the documentation to describe this as intended
behaviour, and my expectation was that when I extended the delay that the
Standby database would come up and be available for read-only transactions,
but would simply delay any new transactions until they were older than
recovery_min_apply_delay.
I'm curious if this was a design decision to stop people getting confused
when they access a database before recovery_min_apply_delay has expired and
get data that's too recent, or if it was an unintentional consequence of the
way it was implemented.
Environment:
Oracle Linux v6.7
PostgreSQL v9.4.5 (Installed via yum)
postgres(at)gclough[5432] ~$ uname -a
Linux gclough 3.8.13-98.4.1.el6uek.x86_64 #2 SMP Wed Sep 23 18:46:01 PDT
2015 x86_64 x86_64 x86_64 GNU/Linux
postgres(at)gclough[5432] ~$ psql --version
psql (PostgreSQL) 9.4.5
Test Case:
1. Create a Primary database on port 5432
export PGDATA=/u01/postgres/9.4/5432/data
export PGPORT=5432
initdb
wal_level = hot_standby # minimal, archive, hot_standby, or
logical
max_wal_senders = 1 # max number of walsender processes
2. Replicate it to a Standby database on port 5433
export PGDATA=/u01/postgres/9.4/5433/data
export PGPORT=5433
port = 5433 # (change requires restart)
wal_level = hot_standby # minimal, archive, hot_standby, or
logical
max_wal_senders = 1 # max number of walsender processes
hot_standby = on # "on" allows queries during
recovery
3. On the Standby (Port 5433), introduce a 2min time delay with
"recovery_min_apply_delay" in the recovery.conf:
standby_mode = 'on'
primary_conninfo = 'port=5432 host=localhost user=postgres
application_name=port_5433'
recovery_min_apply_delay = 2min
4. Check the current lag on the Standby (Port 5433):
postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
pg_last_xlog_receive_location() "receive_xlog",
pg_last_xlog_replay_location() "replay_xlog",
pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
Wed Nov 11 15:32:32 GMT 2015
receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
--------------+-------------+-------------------------------+------------
0/7000000 | 0/7000000 | | t
(1 row)
5. Create a small test table on the Primary (Port 5432):
CREATE TABLE delay_test (id NUMERIC);
INSERT INTO delay_test VALUES (1);
6. Check the application of WAL on the Standby (Port 5433), and note the
delay:
postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
pg_last_xlog_receive_location() "receive_xlog",
pg_last_xlog_replay_location() "replay_xlog",
pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
Wed Nov 11 15:33:06 GMT 2015
receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
--------------+-------------+-------------------------------+------------
0/7015DC8 | 0/7015890 | | t
(1 row)
postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
pg_last_xlog_receive_location() "receive_xlog",
pg_last_xlog_replay_location() "replay_xlog",
pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
Wed Nov 11 15:34:00 GMT 2015
receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
--------------+-------------+-------------------------------+------------
0/7015E00 | 0/7015890 | | t
(1 row)
postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
pg_last_xlog_receive_location() "receive_xlog",
pg_last_xlog_replay_location() "replay_xlog",
pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
Wed Nov 11 15:35:01 GMT 2015
receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
--------------+-------------+-------------------------------+------------
0/7015E00 | 0/7015E00 | 2015-11-11 15:32:53.950376+00 | t
(1 row)
7. Reset the time delay on the Standby from 2 to 5 minutes
recovery_min_apply_delay = 5min
pg_ctl restart -m fast
8. Try to connect to the Standby, and get a failure:
postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
pg_last_xlog_receive_location() "receive_xlog",
pg_last_xlog_replay_location() "replay_xlog",
pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
Wed Nov 11 15:35:26 GMT 2015
psql: FATAL: the database system is starting up
postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
pg_last_xlog_receive_location() "receive_xlog",
pg_last_xlog_replay_location() "replay_xlog",
pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
Wed Nov 11 15:36:01 GMT 2015
psql: FATAL: the database system is starting up
postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
pg_last_xlog_receive_location() "receive_xlog",
pg_last_xlog_replay_location() "replay_xlog",
pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
Wed Nov 11 15:37:00 GMT 2015
psql: FATAL: the database system is starting up
9. Wait 3 minutes... then it works:
postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
pg_last_xlog_receive_location() "receive_xlog",
pg_last_xlog_replay_location() "replay_xlog",
pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
Wed Nov 11 15:38:00 GMT 2015
receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
--------------+-------------+-------------------------------+------------
0/7015ED8 | 0/7015ED8 | 2015-11-11 15:32:53.950376+00 | t
(1 row)
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2015-11-11 16:18:35 | Re: BUG #13768: JSONB concat |
Previous Message | jhirsch | 2015-11-11 15:48:02 | Re: BUG #13622: unsafe use of relative rpath libpq.5.dylib |