BDR - DDL Locking

From: Will McCormick <wmccormick(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: BDR - DDL Locking
Date: 2015-10-20 15:31:23
Message-ID: CA+jgkY4uU=_Mtiwqu5kSDUEcqpfADUOS6isZnwe_g+_gc4ba3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

First time user here and new to PostgreSQL and BDR so I hope I have the
right place.

I attempted to issues a TRUNCATE TABLE without the cascade option on a
Parent table that had a child FK constraint.

After this I received:

ERROR: Database is locked against DDL operations
HINT: Node (6203352813534641995,1,16387) in the cluster is already
performing DDL

I'm trying to remove the lock so I:

bms=# select * from pg_locks;
locktype | database | relation | page | tuple | virtualxid |
transactionid | classid | objid | objsubid | virtualtransaction | pid
| mode | granted | fastpath
------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+------+-----------------+---------+----------
relation | 16387 | 11189 | | |
| | | | | 6/58468 | 3049 |
AccessShareLock | t | t
virtualxid | | | | | 6/58468 |
| | | | 6/58468 | 3049 | ExclusiveLock
| t | t

bms=# SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
bms-# ON pl.virtualtransaction = '-1/' || ppx.transaction;
locktype | database | relation | page | tuple | virtualxid |
transactionid | classid | objid | objsubid | virtualtransaction | pid
| mode | granted | fastpath | transaction | gid | prepared |
owner | database
------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------+----------+-------------+-----+----------+-
------+----------
virtualxid | | | | | 5/861 |
| | | | 5/861 | 20010 | ExclusiveLock
| t | t | | | |
|
relation | 16387 | 11201 | | |
| | | | | 6/58456 | 3049 |
AccessShareLock | t | t | | | |
|
relation | 16387 | 11189 | | |
| | | | | 6/58456 | 3049 |
AccessShareLock | t | t | | | |
|
virtualxid | | | | | 6/58456 |
| | | | 6/58456 | 3049 | ExclusiveLock
| t | t | | | |
|
relation | 0 | 2671 | | |
| | | | | 6/58456 | 3049 |
AccessShareLock | t | f | | | |
|
relation | 0 | 1262 | | |
| | | | | 6/58456 | 3049 |
AccessShareLock | t | f | | | |
|
relation | 0 | 1260 | | |
| | | | | 6/58456 | 3049 |
AccessShareLock | t | f | | | |
|
relation | 0 | 2672 | | |
| | | | | 6/58456 | 3049 |
AccessShareLock | t | f | | | |
|
relation | 0 | 2677 | | |
| | | | | 6/58456 | 3049 |
AccessShareLock | t | f | | | |
|
relation | 0 | 2676 | | |
| | | | | 6/58456 | 3049 |
AccessShareLock | t | f | | | |
|
(10 rows)

bms=# select pg_cancel_backend(20010);
pg_cancel_backend
-------------------
t

bms=# select pg_cancel_backend(3049);
ERROR: canceling statement due to user request

3049 never dies. If I restart postgresql the lock persists.

Server details:

[postgres(at)klw1129 ~/product/9.4.4/logs] psql -U bms -d bms
Password for user bms:
psql (9.4.4)
Type "help" for help.

bms=# select * from bdr.bdr_nodes;
node_sysid | node_timeline | node_dboid | node_status | node_name
| node_local_dsn |
node_init_from_dsn
---------------------+---------------+------------+-------------+-----------+-----------------------------------------------------------------+-----------------------------------------------------------------
6203352813534641995 | 1 | 16387 | r | KLW1128
| port=5432 dbname=bms user=bms password=bms host=192.168.180.228 |
6203352897032163158 | 1 | 16387 | r | KLW1129
| port=5432 dbname=bms user=bms password=bms host=192.168.180.229 |
port=5432 dbname=bms user=bms password=bms host=192.168.180.228

[postgres(at)klw1128 ~/product/9.4.4/logs] psql -U bms -d bms
Password for user bms:
psql (9.4.4)
Type "help" for help.

bms=# select * from bdr.bdr_nodes;
node_sysid | node_timeline | node_dboid | node_status | node_name
| node_local_dsn |
node_init_from_dsn
---------------------+---------------+------------+-------------+-----------+-----------------------------------------------------------------+-----------------------------------------------------------------
6203352813534641995 | 1 | 16387 | r | KLW1128
| port=5432 dbname=bms user=bms password=bms host=192.168.180.228 |
6203352897032163158 | 1 | 16387 | r | KLW1129
| port=5432 dbname=bms user=bms password=bms host=192.168.180.229 |
port=5432 dbname=bms user=bms password=bms host=192.168.180.228
(2 rows)

My log file on klw1129 (Target of DDL):

2015-10-20 15:28:11 GMTLOG: starting background worker process "bdr
(6203352897032163158,1,16387,)->bdr (6203352813534641995,1,"
2015-10-20 15:28:11 GMTERROR: cannot truncate a table referenced in a
foreign key constraint
2015-10-20 15:28:11 GMTDETAIL: Table "conflict_child" references
"conflict".
2015-10-20 15:28:11 GMTHINT: Truncate table "conflict_child" at the same
time, or use TRUNCATE ... CASCADE.
2015-10-20 15:28:11 GMTCONTEXT: during DDL replay of ddl statement:
TRUNCATE TABLE ONLY bms.conflict
2015-10-20 15:28:11 GMTLOG: worker process: bdr
(6203352897032163158,1,16387,)->bdr (6203352813534641995,1, (PID 28543)
exited with exit code 1
2015-10-20 15:28:16 GMTLOG: starting background worker process "bdr
(6203352897032163158,1,16387,)->bdr (6203352813534641995,1,"
2015-10-20 15:28:16 GMTERROR: cannot truncate a table referenced in a
foreign key constraint
2015-10-20 15:28:16 GMTDETAIL: Table "conflict_child" references
"conflict".
2015-10-20 15:28:16 GMTHINT: Truncate table "conflict_child" at the same
time, or use TRUNCATE ... CASCADE.
2015-10-20 15:28:16 GMTCONTEXT: during DDL replay of ddl statement:
TRUNCATE TABLE ONLY bms.conflict
2015-10-20 15:28:16 GMTLOG: worker process: bdr
(6203352897032163158,1,16387,)->bdr (6203352813534641995,1, (PID 28546)
exited with exit code 1

My log file on klw1128 (Source of DDL):

2015-10-20 15:26:55 GMTLOG: starting logical decoding for slot
"bdr_16387_6203352897032163158_1_16387__"
2015-10-20 15:26:55 GMTDETAIL: streaming transactions committing after
0/808BBC8, reading WAL from 0/808BB00
2015-10-20 15:26:55 GMTLOG: logical decoding found consistent point at
0/808BB00
2015-10-20 15:26:55 GMTDETAIL: There are no running transactions.
2015-10-20 15:26:55 GMTLOG: could not receive data from client: Connection
reset by peer
2015-10-20 15:26:55 GMTLOG: unexpected EOF on standby connection

[postgres(at)klw1129 ~/product/9.4.4/logs] psql -U bms -d bms
Password for user bms:
psql (9.4.4)
Type "help" for help.

bms=# select * from pg_extension;
extname | extowner | extnamespace | extrelocatable | extversion |
extconfig |
extcondition
------------+----------+--------------+----------------+------------+---------------------------------------------------------------+---------------------------------
plpgsql | 10 | 11 | f | 1.0 |
|
btree_gist | 16385 | 2200 | t | 1.0 |
|
bdr | 16385 | 11 | f | 0.9.2.0 |
{16919,16934,16947,16983,17013,17023,17032,17039,17052,17121} |
{"","","","","","","","","",""}
(3 rows)

How do I recover from this scenario without a rebuild?
If I cannot recover from this scenario what caused this?

I'm currently testing postgreSQL and BDR for productization.

Thanks,

Will McCormick

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Merlin Moncure 2015-10-20 16:26:34 Re: RAID and SSD configuration question
Previous Message Tomas Vondra 2015-10-20 15:25:27 Re: RAID and SSD configuration question