From: | Vladimir Borodin <root(at)simply(dot)name> |
---|---|
To: | pgsql-admin <pgsql-admin(at)postgresql(dot)org> |
Subject: | Replication slots and isolation levels |
Date: | 2015-10-27 16:45:58 |
Message-ID: | 7F74C5EA-6741-44FC-B6C6-E96F18D761FB@simply.name |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin pgsql-hackers |
Hi all.
I’m wondering why do I get conflicts with recovery on hot standby using replication slots and read commited isolation level? And if I start repeatable read transaction I don’t get any errors. Below is some diagnostics.
I’m using 9.4.4 (upgrade is planned) from yum.postgresql.org <http://yum.postgresql.org/> packages on both master and standby. Configs are the same on both master and standby:
rpopdb01d/postgres M # SELECT name, setting FROM pg_settings
WHERE category LIKE 'Replication%' or category LIKE 'Write-Ahead Log';
name | setting
------------------------------+---------
hot_standby | on
hot_standby_feedback | on
max_replication_slots | 1
max_standby_archive_delay | 30000
max_standby_streaming_delay | 30000
max_wal_senders | 10
synchronous_standby_names |
vacuum_defer_cleanup_age | 200000
wal_keep_segments | 64
wal_receiver_status_interval | 1
wal_receiver_timeout | 60000
wal_sender_timeout | 3000
(12 rows)
Time: 1.583 ms
rpopdb01d/postgres M #
On the master I’ve created a physical replication slot and attached standby to it, I do see changing xmin and restart_lsn fields in pg_replication_slots view.
rpopdb01d/postgres M # select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn
----------------------+--------+-----------+--------+----------+--------+------------+--------------+---------------
rpopdb01e_domain_com | [null] | physical | [null] | [null] | t | 2127399287 | [null] | 960B/415C79C8
(1 row)
Time: 0.463 ms
rpopdb01d/postgres M #
When I start a read commited transaction on standby (or use autocommit mode, doesn’t matter) I still see that xmin in pg_replication_slots view on master increases. If I do run a heavy SELECT statement, at some point of time (presumably after vacuum_defer_cleanup_age expires) standby starts to lag replication apply and when it hits max_standby_streaming_delay I get 40001 sql code, either ERROR or FATAL:
rpopdb01e/rpopdb R # SHOW transaction_isolation ;
transaction_isolation
-----------------------
read committed
(1 row)
Time: 0.324 ms
rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
ERROR: 40001: canceling statement due to conflict with recovery
DETAIL: User query might have needed to see row versions that must be removed.
LOCATION: ProcessInterrupts, postgres.c:2990
Time: 199791.339 ms
rpopdb01e/rpopdb R #
rpopdb01e/rpopdb R # SHOW transaction_isolation ;
transaction_isolation
-----------------------
read committed
(1 row)
Time: 0.258 ms
rpopdb01e/rpopdb R # BEGIN;
BEGIN
Time: 0.067 ms
rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
FATAL: 40001: terminating connection due to conflict with recovery
DETAIL: User was holding a relation lock for too long.
LOCATION: ProcessInterrupts, postgres.c:2857
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
Time: 307864.830 ms
rpopdb01e/rpopdb R #
The behavior is the same as expected to be without using replication slots.
But when I start repeatable read transaction xmin field in pg_replication_slots view on master freezes (while restart_lsn is still increasing) and I don’t get any replication lag and conflicts with recovery. When I end this transaction, xmin starts increasing again.
rpopdb01e/rpopdb R # begin transaction isolation level repeatable read;
BEGIN
Time: 0.118 ms
rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
count
------------
3106222429
(1 row)
Time: 411944.889 ms
rpopdb01e/rpopdb R # ROLLBACK;
ROLLBACK
Time: 0.269 ms
rpopdb01e/rpopdb R #
And that is what I expect. Am I missing something or is it expected behavior in read commited mode?
Thanks in advance.
--
May the force be with you…
https://simply.name
From | Date | Subject | |
---|---|---|---|
Next Message | Mark Steben | 2015-10-27 18:08:46 | pg_tblspc and fsm files |
Previous Message | Magnus Hagander | 2015-10-24 11:51:45 | Re: pg_basebackup Error could not open directory |
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2015-10-27 17:04:55 | Re: fortnight interval support |
Previous Message | Alvaro Herrera | 2015-10-27 16:27:03 | Re: Duplicated assignment of slot_name in walsender.c |